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FOREWORD 


This  document  provides  the  methodology  used  in  the  development  of  the  MSFC  1 3-month 
smoothed  solar  flux  and  geomagnetic  index  intermediate  (months)  and  long-range  (years)  statistical 
estimation  technique.  These  estimates  are  provided  as  input  to  orbital  lifetime  models. 

Section  1  is  an  introduction  and  reason  the  estimates  are  determined.  Section  2  is  the  history  of  the 
development  of  the  computer  program  that  calculates  the  estimates.  Section  3  contains  the 
methodology  of  the  calculation  technique.  Section  4  describes  the  flow  of  the  computer  program  to 
generate  the  estimates.  Section  5  discusses  the  engineering  use  of  the  computer  program’s  products. 
Appendix  A  gives  the  linear  least  squares  model  development.  Appendix  B  is  an  overview  of  the 
cubic  connection  between  the  predicted  cycle  and  the  mean.  Appendix  C  provides  the  rationale  for 
the  statistics  used  to  calculate  the  percentile  values.  Appendix  D  provides  the  McNish-Lincoln  and 
quantile  computer  programs.  Appendix  E  contains  an  example  of  the  computer  program  product 
provided  in  the  monthly  Marshal  Space  Flight  Center  solar  activity  memorandum.  Appendix  F  gives 
an  assessment  of  the  estimation  technique  accomplished  for  a  number  of  past  solar  cycles. 

Questions  or  comments,  contact  NASA  Marshall  Space  Flight  Center  Systems  Analysis  and 
Integration  Laboratory,  Electromagnetics  and  Aerospace  Environments  Branch,  Chief,  Steven  D. 
Pearson  (205)  544-2350. 
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Technical  Memorandum 


Statistical  Technique  for  Intermediate  and  Long-Range 
Estimation  of  13-Month  Smoothed  Solar  Flux  and 
Geomagnetic  Index 


1.0  Introduction 

Because  no  generally  accepted  solar  physical  model  is  available  to  accurately  predict 
future  solar  activity,  Marshall  Space  Flight  Center  (MSFC)  developed  a  13-month  smoothed 
solar  flux  and  geomametic  index  intermediate  (months)  and  long-range  (years)  statistical 
estimation  technique  .  The  reason  for  issuing  intermediate  and  long-range  solar  activity 
estimates  is  the  need  for  updated  inputs  to  the  upper  atmosphere  density  models  used  for 
satellite  orbital  lifetime  predictions  and  performance  requirement  analyses^.  Mission  analysis 
and  planning  for  future  spacecraft  launches  and  on-orbit  operations  require  estimates  of 
orbital  lifetimes,  altitudes,  inclinations,  and  eccentricities.  This  report  documents  the  MSFC 
13-month  smoothed  solar  flux  and  geomagnetic  index  intermediate  and  long-range  statistical 
estimation  technique,  referred  to  as  the  MSFC  Lagrangian  Linear  Regression  Technique 
(MLLRT). 


2.0  Background 

Excellent  surv^s  of  various  future  solar  activity  estimation  techniques  were 
presented  by  Vitinskii ,  Scissum^,  Slutz,  et  al.^  and  Donnelly®.  Neural  Network 
applications  have  been  more  recently  made  by  Macpherson  et  al.’,  and  Calvo  et  al.®  and 
others.  The  linear  regression  method,  applied  by  McNish  and  Lincoln®,  was  modified  by 
Boykin  and  Richards™,  and  Avaritt".  A  later  improvement  applying  the  modified  McNish 
and  Lincoln  linear  regression  method  to  develop  the  MSFC  Lagrangian  Linear  Regression 
Technique  was  accomplished  by  Holland  and  Vaughan'. 

Yu.  I.  Vitinskii^  conducted  an  extensive  survey  and  analysis  of  solar  activity 
prediction  methods.  While  recognizing  the  magnitude  of  the  problem  and  encouraging 
studies  of  active  processes  taking  place  on  the  Sun  to  solve  it,  he  reiterated  the  current  status: 
"...we  have  shown  that  the  reliability  of  the  results  obtained  using  these  methods  still  leaves 
much  to  be  desired."  His  analysis,  however,  showed  the  linear  regression  method  usually 
gives  accurate  results  to  a  year  in  advance.  For  several-years-in-advance,  the  linear 
regression  method  becomes  increasingly  less  accurate. 

McNish  and  Lincoln®  suggested  that  the  estimation  of  a  sunspot  cycle’s  future 
behavior,  based  on  the  mean  approximation  of  all  past  cycles,  could  be  improved  by  adding 
to  the  mean  a  correction  proportional  to  the  departure  of  the  current  value  of  the  cycle  from 
the  mean  cycle.  They  also  recommended  the  method  not  be  used  for  making  future 
projections  longer  than  one  year.  Using  a  data  base  with  two  additional  solar  cycles,  Boykin 
and  Richards™  modified  the  McNish-Lincoln  linear  regression  method  so  the  13-month 
smoothed  relative  sunspot  number  ( R)  could  be  estimated  for  10  years  in  advance,  at 
quarterly  rather  than  yearly  intervals.  Figure  2-1  illustrates  this  method  that  is  also  applicable 

to  13-month  smoothed  solar  flux  ( F,o  7)  and  geomagnetic  index  ( Ap) . 
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Figure  2-1 .  Modified  McNish-Lincoln  Linear  Regression  Method. 

Holland,  and  Vaughan'  determined  that  better  statistical  estimations  are  possible,  in  a  chi 
square  sense,  by  selecting  the  start  and  end  of  each  set  (solar  cycle)  at  the  maximum  (or 
minimum)  and  normalizing  the  data  sets  using  a  Lagrangian  linear  regression  statistical 
technique.  This  determination  and  initialization  of  the  modified  McNish-Lincoln  linear 
regression  method  at  the  cycle’s  maximum  or  minimum  establish  the  current  MSFC  statistical 

technique  for  intermediate  and  long-range  estimation  of  F|07  and  Ap. 


3.0  MSFC  Lagrangian  Linear  Regression  Methodology 

Sections  3.1  and  3.2  present  the  methodology  used  to  develop  the  Fj^,  and  Ap,  data 
bases  and  the  modified  McNish-Lincoln  linear  regression  method  used  in  the  MSFC 
Lagrangian  Linear  Regression  Technique  (MLLRT). 

3.1  Development  of  F,(,7  and  Ap  Data  Bases 

In  order  to  use  the  modified  McNish-Lincoln  linear  regression  method  to  estimate 
future  solar  flux  and  geomagnetic  index,  F,o7  and  Ap,  data  bases  were  developed.  Two  data 
bases  for  F,o7  and  Ap  were  developed  depending  on  what  estimates  are  relative  to  time  point 
in  the  present  cycle.  Solar  cycle  minimum  estimates  use  a  maximum  to  maximum  initialized 
data  base.  Note  when  using  the  maximum  to  maximum  data  base,  the  cycles  are  identified  by 
cycle  number  at  maximum  initialization  date.  For  example,  maximum  to  maximum  cycle  18 
starts  in  May  1 947  with  the  first  half  of  cycle  1 8  and  the  last  half  of  cycle  1 9.  Solar  cycle 
maximum  estimates  use  a  minimum  to  minimum  initialized  data  ba.se.  To  develop  a  data  ba.sc 

select  the  time  of  all  maximum  points  of  the  F,(,  7  for  each  cycle,  and  use  these  same  time 
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points  for  the  A^,  or  depending  on  the  desire  estimate,  select  the  time  of  all  minimum  points 
of  the  F,(,  7  and  Ap  for  each  cycle.  Determine  all  periods,  Pj,  related  to  maximum  to 
maximum  (or  minimum  to  minimum)  for  the  cycles.  Calculate  the  mean  period,  P,  for 
cycles  initialized  at  maximum  (or  minimum).  Divide  all  cycle  periods  by  P.  This  gives  the 
mean  time  increment,  tj  =  Pj/P,  to  be  used  for  the  j*  cycle.  Note  that  may  be  less  than, 
equal  to,  or  greater  than  one  month.  Finally,  consider  each  cycle  a  group  of  data  with  m 
approximately  equal  to  the  number  of  points  in  P.  To  obtain  an  F,o7  or  Ap  value 
corresponding  to  each  of  the  m  points  per  cycle  requires  interpolation  since  points  are  not 
selected  at  precisely  one  month  intervals  for  all  cycles.  A  more  detailed  description  of  the 
data  bases  development  is  presented  in  sections  3.1.1  and  3.1.2. 


3.1.1  13-Month  Smoothed  Solar  Flux  (Fjo^)  Data  Base 

Although  some  researchers  believe  they  have  sufficient  reason  to  separate  the  data  for 
sunspot  cycles  1  through  8  from  th^total  data  base,  the  MSFC  Lagrangian  Linear  Regression 
Technique  for  estimation  of  future  Fjo  ^  uses  the  observed  data  for  all  observed  cycles. 
Including  cycles  1  through  8  provides  information  applicable  to  the  apparent  behavior  of  the 
cycle  period  and  to  the  overall  magnitude  during  this  time  frame  as  well  as  a  larger  data  base 
for  statistical  estimates.  The  measured  Fjo  ^  data  base  was  extended  back  to  1749  by  using 

Wolfs  relative  sunspot  values,  R,  and  a  R  to  F,o7  conversion  equation.  R  is  defined  by  the 
equation: 

R=k(10g  +  f)  (1) 

where  R  is  the  Wolf  number,  k  is  a  correction  factor  to  equalize  counts  from  different 
observers,  g  is  the  number  of  groups  visible  on  a  given  day,  and  f  is  the  number  of  single 
spots  observed  on  a  given  day'^.  The  R  values  were  smoothed  using  the  Zurich  13-month 
smoothing  equation: 


Ri  =  — 
‘  12 


i+5 


(2) 


where  i  indicates  the  month  of  interest.  This  smoothing  technique  was  developed  by  the 

Swiss  Federal  Observatory,  Zurich,  Switzerland*^.  Once  R  values  are  smoothed  to  R 
values,  the  following  equation  (developed  by  MSFC  in  collaboration  with  Jack  Slowey  of  the 
Smithsonian  Astrophysical  Observatory)  converts  recorded  R  data  to  F,q7  data: 

F,o7  =  49.4  +  0.97  R  +  17.6e(-®°^^^>  (3) 


Equation  (3),  derived  from  recorded  F,o7  and  R  data  from  1947  to  1978,  has  a  linear 

correlation  coefficient  of  0.98.  Figure  3-1,  a  plot  of  equation  (3)  with  R  and  F,o7  data  to 
1995,  shows  that  equation  (3)  is  still  adequate  for  today’s  applications. 
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Figure  3-1.  R  versus  with  Curve  Fit  from  Conversion  Equation. 

Since  1947,  observed  values  of  daily  solar  flux  are  used  to  compile  mean  monthly  Fjpj 
values.  Use  equation  (2),  replacing  R  with  Fj^,,  to  calculate  the  Fi^,.  Figure  3-2  is  a  plot  of 
converted  and  observed  F,qj  data  for  all  cycles  since  1749. 


Year 


Figure  3-2.  13-Month  Smoothed  (Converted  and  Observed)  Solar  Flux  ( Fjoj) . 

All  solar  flux  cycles  in  the  regression  technique  data  base  prior  to  1947  use,  as  the 
starting  point,  the  established  R  maximum  or  minimum.  After  1946,  the  maximum  or 
minimum  is  based  on  the  observed  F,o7  maximum  or  minimum  values  computed  from  the 
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measured  values'.  Table  3-1  has  the  minimum  and  maximum  starting  point  years  and 
period  of  each  cycle  with  associated  converted  or  observed  values.  Data  format  is  vear 

TaSiSSte  moS^SsTs'' "  ^.917.  The  equation  to 


month  decimal  value  =  (month  number  - 1)/12 


(4) 


Table  3-1.  Solar  Cycle  Elements  Information  in  Data  Base 


Cycle 

Year/Month 

of 

Minimum 

Year/Month 

of 

Maximum 

Minimum 
Solar  Flux 
Value* 

Maximum 
Solar  Flux 
Value* 

Rise  Time 
in 

Year/Month 

Fall  Time 
in 

Year/Month 

Min.  to  Min. 
Period 

(Year/Month) 

Max.  to  Max 
Period 

0 

1 

2 

3 

4 

5 

6 

7 

8 

9 

10 

11 

12 

13 

14 

15 

16 

17 

18 

19 

20 

21 

22 

1755.083 

1766.417 

1775.417 

1784.667 

1798.250 

1810.500 
1823.333 
1833.833 

1843.500 

1855.917 

1867.167 

1878.917 

1890.167 
1902.000 

1913.500 
1923.583 

1933.667 
1944.083 

1954.250 
1964.750 
1976.417 

1986.667 

1750.250 

1761.417 

1769.667 

1778.333 
1788.083 
1805.083 

1816.333 
1829.833 

1837.167 
1848.083 
1860.083 

1870.583 
1883.917 
1894.000 
1906.083 

1917.583 

1928.250 

1937.250 

1947.333 

1958.167 
1970.500 

1981.333 
1989.417 

70.66 

72.13 

70.06 

71.26 

68.22 

67.00 

67.03 

70.12 

71.78 

68.23 

69.11 

67.85 

69.01 

68.02 

67.54 

69.30 

68.36 

70.33 

69.69 

72.59 

73.27 

72.90 

139.87 

134.13 

162.02 

203.26 
186.47 

100.26 
99.84 

120.35 

191.99 

177.61 

144.97 

185.82 

123.04 

135.50 

113.54 
152.07 
126.32 
165.30 
214.39 
245.45 
156.34 

204.55 
213.11 

6.334 
3.250 
2.916 
3.416 

6.833 

5.833 
6.500 

3.334 

4.583 
4.166 
3.416 
5.000 

3.833 
4.083 

4.083 

4.667 

3.583 
3.250 
3.917 

5.750 
4.916 

2.750 

4.833 
5.000 
5.750 
6.334 

10.167 

5.417 

7.000 

4.000 

6.333 

7.834 

7.084 

8.334 
6.250 
8.000 

7.417 
6.000 

5.417 
6.833 

6.917 
6.583 

5.917 

5.334 

11.334 

9.000 

9.250 

13.583 

12.250 

12.833 

10.500 
9.667 

12.417 

11.250 
11.750 

11.250 

11.833 

11.500 
10.083 
10.084 
10.416 
10.167 

10.500 
11.667 

10.250 

11.167 

8.250 

8.666 

9.750 

17.000 

11.250 

13.500 
7.334 

10.916 

12.000 

10.500 
13.334 

10.083 

12.083 

11.500 
10.667 
9.000 

10.083 

10.834 

12.333 

10.833 

8.084 

nummary: 

Median  (50%) 

Mean 

Standard  Deviation  (a) 

69.50 

69.75 

1.93 

156.34 

160.70 

39.92 

4.08 

4.38 

1.21 

6.33 

6.49 

1.36 

11.25 

11.03 

1.19 

10.83 

10.87 

2.12 

^^onvertea  pnor  to  1 947  and  observed  after  1 946 

A  .  f  converted  and  observed  Fjo^  data  are  Lagrangian  interpolated  to  normalize  the 
data  tor  the  132  months  from  the  maximum  or  minimum  cycle  starting  dates.  The  data  is 
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3.1.2  13-Month  Smoothed  Geomagnetic  Index  (Ap)  Data  Base 

Because  the  measured  geomagnetic  index  (Ap)  data  base  is  relatively  short  (1932  to 
1996),  it  was  extended  back  to  1884  using  mean  monthly  magnetic  character  figure  (Cj)  data. 
The  niean  monthly  C  data  is  converted  to  13-month  smoothed  data  using  equation  (2)  and 
replacing  R  with  Q.  Once^e  mean  monthly  Q  is  converted  to  13-month  smooth^ 
magnetic  character  figure  ( Cj),  use  equation  (5)  to  convert  the  extended  record  of  Cj  data  to 

Ap  values. 

Ap  =  2.8068e''''^'  (5) 

This  equation,  derived  from  recorded'^  C-  and  Ap  data  from  1932  to  196^ has  a  linear 
correlation  coefficient  of  0.80.  Figure  3-3  is  a  plot  of  equation  (5)  with  and  Ap  from 
1932  to  1963. 
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Figure  3-3.  Cj  versus  Ap  with  Curve  Fit  from  Conversion  Equation. 


After  1931,  the  measured  values  of  daily  A^  are  use^to  compute  the  mean  monthly  value. 
Use  equation  (2),  replacing  R  with  Ap,  to  calculate  Ap.  Figure  3-4  is  a  plot  of  converted  and 
recorded  Ap  data  for  all  cycles  since  1884. 
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Year 

Figure  3-4.  13-Month  Smoothed  (Converted  and  Recorded)  Geomagnetic  Index  ( Ap). 

The  converted  and  recorded  Ap  data  are  Lagrangian  interpolated  to  normalize  the  data 
for  the  132  months  from  the  maximum  or  minimum  cycle  starting  dates.  The  data  is  stored 
by  month  and  cycle  number  to  constmct  a  data  base  for  use  in  the  modified  McNish-Lincoln 
linear  regression  method. 

3.2  Modified  McNish-Lincoln  Linear  Regression  Method 

The  MLLRT  uses  the  Boykin  and  Richards’”  modified  McNish-Lincoln  linear 
regression  method  (for  the  linear  least  squares  method  development  see  appendix  A)  and  an 
s^ropriately  constracted  data  base  that  starts  at  the  maximum  or  minimum  to  estimate  the 
balance  of  the  present  cycle  where  die  cycle  is  defined  from  minimum  to  minimum  or 
maximum  to  maximum.  This  method  is  summarized  in  the  following  steps: 

1 .  Mean  Fj^,  or  Ap  is  calculated  from  the  completed  cycles  in  the  F,o7  or  Ap  data 
base  constructed  using  the  Lagrangian  interpolated  data  points  for  use  in  the  McNish-Lincoln 
linear  regression  method.  This  mean  also  estimates  F^j  or  Ap  for  the  next  cycle  with  P . 

2 .  McNish-Lincoln  linear  regression  mediod  produces  a  statistical  estimate  for  the 
rest  of  the  present  cycle  using  one  linear  coefficient.  The  period  for  the  present  cycle,  for 

which  estimates  of  solar  activity  are  being  calculated,  is  the  P . 

3 .  Since,  for  the  present  cycle,  only  21  or  22  corresponding  points  are  available  for 
a  linear  regression  fit  of  the  estimated  point  to  the  last  observed  point,  to  justify  calculating  a 
standard  deviation  based  on  a  normal  distribution  function  is  difficult.  TTiis  non-normal 
distribution  function  produces  upper  and  lower  bounds  that  can  and  do  go  below  the 
parameter  physical  limits.  Despite  being  a  non-normal  distribution,  the  data  are  standardized 
to  make  calculations  easier.  The  actual  distribution  of  deviations  from  the  smoothed  linear 
regression  line  and  mean  line  is  divided  by  the  standard  deviation  and  used  to  determine  the 
upper  and  lower  bounds  at  predetermined  percentile  levels.  Upper  and  lower  bounds  are 
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calculated  by  the  Quantile  method.  The  equation  u.sed  is: 


Q(Xi)  =  i/(n+l)  (6) 

where  Q  is  quantile,  i  equals  1  through  the  total  number  of  completed  cycles,  and  n  is  equal 
to  the  total  number  of  completed  cycles.  Once  the  quantile  is  calculated  the  percentile  is: 

Percentile(y)  =  100.0  Q(x)  (7) 

Rationale  for  using  the  Quantile  method  is  discussed  in  appendix  C. 

4.  Between  the  upper  and  lower  bounds  discussed  in  step  3  is  the  "error  space"  on  a 
two-dimensional  plot  of  F,o7  or  Ap  versus  time,  t. 


4.0  Application  of  MSFC  Lagrangian  Linear  Regression  Technique 

The  computer  programs  to  calculate  statistical  estimates  for  future  F,(,7  and  Ap  cycles 
and  to  perform  subsequent  statistical  analyses  were  developed  in  FORTRAN.  The  following 
sections  describe  how  the  programs  implement  the  MLLRT.  MLLRT  consists  of  three  main 

programs:  (1)  the  F,o7  and  Ap  preprocessor  computer  program  (PREPROC),  (2)  modified 
McNish-Lincoln  linear  regression  computer  program  (FORECAST),  and  (3)  statistical 
estimate  output  report  computer  program  (REPORT).  The.se  programs  are  used  to  produce 
the  MSFC  monthly  solar  activity  memorandum  data  (see  appendix  E).  Figure  4-1  is  a 
summary  block  diagram  showing  the  flow  of  files  through  these  programs. 
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4.1  F,(,7  and  Ap  Preprocessor  Computer  Program 

PREPROC  uses  mean  monthly  R,  F,o ,,  Q,  and  Ap  data  to  supply  the  linear  regression 
program  with  13-month  smoothed  data  initialized  at  the  cycle’s  maximum  or  minimum  solar  activity 

dates.  Table  3-1  summarizes  the  initialization  F,o  7  dates  for  present  and  past  solar  cycles.  Data  are 

initialized  at  the  F,o7  maximum  (minimum)  dates  to  enable  the  program  to  better  estimate  duration  of 
the  present  cycle  and  when  its  minimum  (maximum)  will  occur.  Once  the  minimum  (maximum)  is 
determined,  the  preprocessor  supplies  the  linear  regression  computer  program  with  13-month 
smoothed  data  initialized  at  the  minimum  (maximum)  solar  activity  date  for  the  computer  program  to 
provide  a  better  estimate  for  the  “new”  present  cycle  and  occurrence  of  its  maximum  (minimum). 
Steps  in  section  3. 1  are  the  methodology  for  the  preprocessor  computer  program.  How  individual 
files  are  generated  for  FORECAST  is  given  in  the  next  paragraph. 

PREPROC  starts  with  the  monthly  mean  values  of  R  and  F,o7  input  data  (flxinput.max  or 
flxinput.min)  or  the  Cj  and  Ap  input  data  (apinput.max  or  apinput.min).  The  flxinput.max  or 
flxinput.min  data  have  two  sources:  the  monthly  mean  R  data  from  M.  Waldmeier’s  book'^,  and  the 
National  Research  Council  of  Canada  monthly  mean  F,o7  data.  Both  sets  of  data  are  smoothed  by 

equation  (2)  in  section  3.1.1  and  R  data  are  converted  to  F,(,7  by  equation  (3).  The  apinput.max  or 
apinput.min  data  have  two  sources:  the  monthly  mean  geomagnetic  Cj  data  from  the  Handbook  of 
Geophysics  and  Space  Environments'^  and  the  Institute  for  Geophysics  in  Gottingen,  Germany 

monthly  mean  A^^ata.  Both  sets  of  data  are  smoothed  by  equation  (2)  in  section  3.1.1  and  Cj  data 
are  converted  to  Ap  data  by  equation  (5).  Once  the  F,o7  or  Ap  sets  are  complete,  the  data  are 
grouped  into  cycles  then  normalized  to  132  points  by  a  3-point  Lagrange  interpolation  subroutine 
(COMB_YLGINT).  F|0  7  or  Ap  output  from  the  preprocessor  program  is  used  in  a  create  file 
subroutine  (CREAT_BLOCK)  which  formats  the  normalized  data  into  a  new  output  file 
(fluxmax.dat  or  fluxmin.dat)  or  (apmax.dat  or  apmin.dat)  for  FORECAST.  Figure  4-2  is  a 
summary  block  diagram  showing  the  flow  of  files  through  PREPROC  and  its  subroutines  to  create 
the  input  file  for  FORECAST. 
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Figure  4-2.  and  Ap  Prq)rocessor  Computer  Program  Block  Diagram. 
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4.2  Modified  McNish-Lincoln  Linear  Regression  Computer  Program 

FORECAST  estimates  F,o  7  and  Ap  for  the  rest  of  the  minimum  to  minimum  or 
maximum  to  maximum  present  cycle  using  the  methodology  discussed  in  section  3.2.  A 
description  of  generating  individual  files  via  FORECAST  follows. 

FORECAST  reads  the  F,o7  or  Ap  data  (fluxmax.dat,  fluxmin.dat  or  apmax.dat, 
apmin.dat)  and  standardizes  completed  cycles  data.  F,o7  or  Ap  data  enter  the  modified  McNish- 
Lincoln  subroutine  (LINCMC)  which  first  calculates  the  mean  of  the  completed  cycles. 
LINCMC  calculates  next  the  difference  between  the  actual  and  mean  used  in  calculating  the 
variance  matrix  (A)  and  covariance  matrix  (B).  LINCMC  uses  an  inverse  matrix  subroutine 
(GJR)  to  calculate  the  A  '  matrix  and  subsequently  the  regression  coefficient  matrix  (C).  Using 
the  C  matrix  LINCMC  estimates  the  rest  of  the  present  cycle  (step  2  section  3.2).  Using  the 
historical  data  base,  the  next  cycle  is  estimated  (step  1  section  3.2).  Percentiles  subroutine 
(PRCNTILE)  calculates  the  95.0  and  5.0  percentile  values  for  the  present  cycle.  PRCNTILE 
divides  the  difference  data  by  the  standard  deviation  then  calculates  the  desired  quantile  value  by 
the  quantile  subroutine  (QUANTILE).  QUANTILE  arranges  the  data  in  ascending  order  in 
subroutine  (RANKIT)  then  calculates  the  quantile  values  (equation  6  in  section  3.2  step  3). 
PRCNTILE  enters  the  find-the-percentile-subroutine  (FNDPRCNT)  and  calculates  the  needed 
percentile  value.  FNDPRCNT  determines  the  location  of  the  desired  quantile  value  then  uses  a 
linear  interpolation  subroutine  (INTERP)  if  the  value  is  not  the  exact  quantile  calculated  by 
QUANTILE.  Once  the  quantile  values  are  determined,  each  is  multiplied  by  100  for  the 
percentile  value.  FORECAST  next  enters  a  smoothing  subroutine  (PCHMAX  or  PCHMIN). 
The  find-cubic-inflection-point  subroutine  (INFLBS)  locates  the  cubic  inflection  point  between 
the  McNish-Lincoln  estimate  and  the  best  statistical  estimate.  The  percentiles  have  a  cubic 
inflection  point  also.  The  cubic  curve  fit  subroutine  (CUBFIT)  smoothes  the  curve  between  the 
two  estimates  (see  appendix  B  for  cubic  connection  theory).  Once  smoothed,  the  final 
combined  data  are  formatted  for  plots  and  tables.  Figure  4-3  is  the  modified  McNish-Lincoln 
linear  regression  computer  program  block  diagram.  Appendix  D  is  a  listing  of  LINCMC  and 
QUANTILE  subroutines. 
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Read  in  Input 
(FLUXMAX.DAT, 
FLUXMIN.DAT, 
APMAX.DAT, 
APMDSf.DAT ) 


Find  the  Standard 
Deviation  of  the 
completed  cycles 


McNish-Lincoln 

Subroutine 

(LINCMC) 


Smooth  out 
discontinuity 
Subroutine 
(PCHMAX  or 
PCHMIN) 


Output  files 
(FLXPLOT.DAT  or 
APPLOT.DAT; 
FLXPRED.DAT  or 
APPRED.DAT; 
FLXOUT.DAT  or 
APOUT.DAT; 
FLXCHECK.DAT  or 
APCHECK.DAT) 


END 


Find-Inflection- 

Point- 

Subroutine 

(INFLBS) 


Cubic  Curve  Fit 
Subroutine 
(CUBFIT) 


RETURN 


Calculate  the 
Mean  of  the 
completed 
solar  cycles 


Calculate  the 
Diffference 
between  the 
Actual  and  the 
Mean 


Calculate 
Variance 
Matrix  (A)  and 
Covariance 
Matrix  (B) 


Calculate  the 
Inverse  of  A 
Subroutine 
(GJR) 


Calculate 
Regression 
Coefficient  (C) 


Calculate  the 
Predicted 
values  for  the 
remainder  of 
the  present 
cycle 


I 

Calculate  the 
Percentiles 
Subroutine 
(PRCNTILE) 


RETURN 


Divide  the 
difference  by 
the  standard 
deviation 


Calculate  the 
Quantile 
Subroutine 
(QUANTILE) 


Find-the- 
Percentile- 
Subroutine 
(FNDPRCNT) 
Use  this  twice 
once  for  upper 
bound  and 
second  time 
for  lower 
bound 


RETURN  td 

prcntileJ 


Arrange  the 
data  in 
ascending 
order 

Subroutine 

(RANKIT) 


Calculate  the 
Quantile 
Values 


Determine  the 
location  of  the 
data  for  the 
percentile 
selected 


Linear 

Interpolation 

Subroutine 

(INTERP) 


RETURN  toN 
FNDPRCNT/ 


Figure  4-3.  Modified  McNish-Lincoln  Linear  Regression  Computer  Program  Block  Diagram. 
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4.3  Statistical  Estimation  Output  Report  Computer  Program 

REPORT  combines  the  and  Ap  data,  reads  the  FORECAST  output  files 
(fluxout.dat  and  apoutput.dat),  and  merges  the  data  via  the  merge  subroutine  (MERGE).  The 
memorandum  printout  subroutine  (FRMEMO)  takes  the  data  from  MERGE  and  prints  the  table 
for  the  MSEC  monthly  solar  activity  memorandum.  Figure  4-4  is  a  summary  block  diagram  of 
the  report  program;  appendix  E  is  an  example  of  the  memorandum. 


Figure  4-4.  Report  Program  Block  Diagram. 

Appendix  E  provides  a  sample  of  the  MLLRT  computer  program  output  during  a 
maximum  to  maximum  cycle.  The  present  1 32-month  period  cycle  initialization  month  is  June 

1989  (cycle  22  maximum)  and  the  F,o7  value  is  213.1.  Using  the  statistical  Lagrangian  data 
base  for  the  converted  and  observed  smoothed  solar  flux,  the  estimate  of  the  95,  ~50,  and  5 
percentile  values  for  F,o  7  is  arranged  in  a  monthly  sequence  through  the  balance  of  cycle  22, 
through  cycle  23,  and  into  the  first  half  of  cycle  24.  Obtainable  from  the  95,  ~50,  and  5 
percentile  F,o7  values  for  the  present  cycle  is  an  estimate  of  the  range  for  the  upcoming  solar 
flux  cycle  maximum  values.  The  range  estimate  given  for  the  1 32  months  from  the  maximum 
of  cycle  23  to  the  maximum  of  cycle  24  is  determined  from  the  mean  (-50%)  cycle  and  the 

upper  (95%)  and  lower  (5%)  bounds  based  on  the  converted  and  observed  F,,,,  data  base 
(currently  21  cycles).  The  tie-in  for  the  present  to  next  cycle  uses  a  cubic  connection  from  the 
nearest  inflection  points  on  the  rising  leg  of  cycle  23  to  the  maximum  of  the  95,  -50, 5 
percentile  values  of  the  flux  for  all  cycles.  This  defines  the  maximum  of  cycle  23  and  the 
extension  of  the  estimate  into  the  next  cycle.  Tie-ins  accomplished  with  a  cubic  curve  fit  explain 
the  relatively  smoothed  appearance  of  the  curve  in  this  tie-in  area.  The  memorandum  in 
appendix  E,  presents  the  95  and  5  percentile  values  for  the  past  21  cycles  as  a  matter  of 
reference  relative  ne  mean  (-50%)  period  used  in  the  statistical  estimation  computer  program. 

Appendix  E  discussion  above  also  applies  to  the  Ap  estimates. 
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5.0  Engineering  Use  of  MSEC  Solar  Activity  Statistical  Estimation  Technique 
Products 

MLLRT’s  future  F,,,^  and  A^,  estimates  are  intended  mainly  for  use  as  inputs  to  iq>per 
atmosphere  models,  i.e.,  Marshall  Engineering  Thermosphere  Model,  (NASA  dk-179359  and 
CR-179389),  NASA/MSFC  Global  Reference  Atmosphere  Model,  (NASA  TM-4715),  and 
others.  Figure  5-1,  for  example,  depicts  iiq>uts  related  to  the  MSFC  spacecraft  orbital  lifetime 
model. 


PREDICTED 
SPACECRAFT 
ORBITAL 
LIFETIME 


Figure  5-1.  MSFC  Spacecraft  Orbital  Lifetime  Prediction  Model  Block  Diagram*  '^. 


Since  there  is  no  method  for  intermediate  and  long-term  predictions  of  daily  Fj^  ^  and  a^,, 
orbital  lifetime  models  use  the  ^10.7  and  Ap  estimates^.  Orbital  lifetime  estimates,  control 
analysis  programs,  et  al. ,  require  a  specific  date  to  associate  with  the  future  estimate  of  F^j^  and 

Ap  to  conq>ute  corresponding  atmospheric  density.  Future  estimated  Fjqj  and  Ap  data  points 
should  be  identified  with  the  first  day  of  the  given  month. 

For  spacecraft  projects  requiring  a  minimum  risk  design  lifetime  orbital  altitude(s)  and 
a  specified  control  capability,  the  95  percentile  estimates  of  Fj^,  and  Ap  are  reconunended. 
Taking  into  account  the  short-term  (days)  dynamics,  these  estimates  permit  the  design  of  a 
statistically  conservative  spacecraft  lifetime  and  control  capability  at  a  given  orbital  altitude(s). 
The  lifetime  determination  should  be  based  on  the  most  current  intermediate  and  long-range 
statistical  estimate  of  the  future  solar  flux  and  geomagnetic  index  consistent  wrdi  the  critical 
project  development  decision  time  points  prior  to  plarmed  launch  of  the  spacecraft. 

Changes  in  orbital  density  associated  with  short-term  variations  in  the  daily  Fj^ ,  and  ap 
required  as  iiqruts  to  the  upper  atmospheric  models,  are  not  represented  by  the  Fj^,  and  Ap 
statistical  estimates  given  in  the  MSFC  monthly  solar  activity  memorandum.  Future  changes  in 
total  atmospheric  density  cannot  be  estimated  with  any  acceptable  degree  of  statistical  confidence 
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using  existing  techniques.  Representative  data  sets,  based  on  past  daily  F,o,7  and  ap  values,  may 
be  utilized  to  compute  this  dynamic  component  of  the  orbital  altitude  density. 

The  design  requirements  for  solar  activity  and  associated  on-orbit  density  values  are 
specified  in  the  appropriate  spacecraft  and  space  vehicle  project  design  requirements 
documentation.  Consult  these  documents  for  this  information. 

6.0  Concluding  Remarks 

Because  no  physical  solar  activity  prediction  models  are  accepted  generally_by  the 
scientific  community,  MSFC  developed  the  MLLRT.  MLLRT  provides  Fjqj  and  values  for 
input  parameters  to  the  MSFC  orbitd  altitude  neutral  atmosphere  modek  noted  ir^section  5.0. 
Although  the  atmosphere  models  are  built  on  inputs  of  Fj^,  and  ap,  the  Fj^,  and  Ap  values  are 
the  lowest  level  of  smoothing  with  any  acceptable  level  of  confidence  that  lends  itself  to 
reasonably  accurate  statistical  prediction. 

The  technique  utilized  by  MSFC  is  based  on  a  small  sample  size  and  a  nongaussian 
distribution  of  Fio,  and  Ap.  To  estimate  future  activity  MSFC  uses  a  mean  cycle  and  deviations 
derived  from  previous  cycles,  combined  with  a  one-term  linear  regression,  to  estimate  future 
activity  of  the  present  cycle.  Since  intermediate  term  (months)  estimates  are  more  accurate  than 
long  range  (years)  estimates,  the  data  base  is  initialized  at  both  established  maximum  and 
minimum  values  of  solar  cycle  activity.  Based  on  data  from  previous  cycles,  the  probability  is 

90  percent  that  the  actual  future  Fj^,  or  Ap  values  will  be  within  the  current  MLLRT  computer 
program  output  for  estimated  95  and  5  percentile  values.  The  computer  program  may  be 
adjusted  to  accommodate  any  desired  percentiles  between  95  and  5. 

To  provide  an  assessment  of  the  MLLRT  computer  program’s  products,  appendix  F 
presents  plots  (Figure  F-1  through  F-6)  of  the  Fj^  ^  future  estimates  from  the  MLLRT  for  a  5- 
year  period  from  date  of  estimation  for  several  different  solar  cycles.  Three  of  the  solar  flux 
estimation  periods  are  for  1 1 -year  i^riods  and  began  at  the  minimum  between  cycles  19-20, 20- 
21,  and  21-22  and  three  at  the  maximum  of  cycles  20, 21  and  22. 
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APPENDIX  A 

Linear  Least  Squares  Model  Development 


Robert  L.  Holland 
Computer  Sciences  Corporation 
Huntsville,  Alabama 

From  the  Lagrangian  interpolated  cycles,  there  are  currently  21  complete  cycles  from 
maximum  to  maximum  pe^s.  There  are  132  interpolated  values  in  each  of  these  cycles.  In  the 
following  development  of  the  modified  McNish-Lincoln  linear  regression  method  each  point  in 
the  data  will  be  referred  to  as  Ry  where  i  refers  to  the  i*  point  within  a  cycle  (1  to  132)  and  j 

refers  to  the  cycle  number  (1  to  22).  Other  variables  will  be  defined  as  they  appear. 

The  expected  value  of  R'  at  p-points  beyond  m,  where  m  is  the  number  of  observed 

values  in  the  current  cycle  can  be  written  as  a  linear  combination  of  any  number  k  <  m 

coefficients  times  the  deviations  from  the  means  of  the  k-known  13-month  smoothed  monthly 
values  within  the  current  cycle. 

Holland,  Rhodes,  and  Euler  included  the  time  derivatives  in  the  linear  combination 
model.  In  the  following  development,  the  derivatives  will  be  omitted  since  their  results 
showed  that  including  more  than  one  coefficient  and  the  derivative  did  not  improve  the 
confidence  level  nor  the  goodness  of  fit.  For  k=2,  the  model  requires  2  coefficients  and  2 

independent  variables  and  likewise  for  k  <  m.  The  k  =  2  model  will  be  developed  for 

simplicity  then  specialized  for  k  =  1,  which  is  the  MLLRT  model  in  use  at  MSFC.  The  model 
may  be  presented  classically  as 

z'  =  ax  +  by  (A-1) 

where  x  and  y  are  the  two  independent  variables,  z'  is  the  dependent  variable,  and  a  and  b  are 
the  linearly-related  (regression)  coefficients. 

z’  is  the  model  predicted  value  of  the  actual  z  data  with  N  known  (xj,  yj,  zj) 

corresponding  points.  It  is  required  to  find  the  coefficients  a  and  b  which  best  fit  the  data  in  a 
least  squares  to  minimize  the  sum  of  the  squares  of  the  deviations  of  the  chosen  model  (A-1) 
predictions  from  the  data.  Substituting  the  data  points  Xj,  yj  in  the  right  side  of  equation  (A-1) 

gives  a  model  predicted  Zj,  i.e., 

Zj  =  axj  +  byj,  (A-2) 


A-1 


1  <  j  <  N  (the  number  of  known  data  points). 


Now  it  is  required  to  minimize  the  sum  of  the  squared  differences  (deviations)  of  these 
predicted  from  the  data  zj.  These  deviations  dj  are 

dj  =  zj  -  z'j  =  Zj  -  (axj  +  byj)  (A-3) 

There  are  N  of  these  deviations  for  each  corresponding  point  within  the  132-point 
Lagrangian  interpolated  cycle.  Denoting  dj,  xj,  yj,  zj  and  Zj  as  n-dimensional  vectors  defined 

as 


rd,^ 

^^1  ^ 

^2 

Z2 

Z2 

X2 
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d  = 
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Z  = 

z’  = 
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X  = 

• 

y  = 

• 

<yN> 

then  equation  (A-3)  can  be  written  in  vector-matrix  notation  as 

d  =  z  -  (ax  +  by)  (A-4) 

The  sum  of  the  squares  of  the  deviations  of  the  data  (z)  from  the  model  prediction  (ax  + 
by)  are  obtained  by  taking  the  dot  product  of  d  with  itself.  This  gives 


d^  =  d  •  d  =  [z  -(ax  +  by)]  •  [z  -(ax  +  by)]  (A-5) 

carrying  our  this  dot  product  gives 

d^  =  z  •  z  -  2z  •  (ax  •  by)  +  (ax  •  by)*  (ax  •  by) 

=  z^  -  2az  •  X  -  2bz  •  y  +  (ax  •  by)^ 
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or 


d"^  =  z"'  -  2az  •  X  -  2bz  •  y  +  a^x^  +  2abx  •  y  +  b'^y 


-2„2 


(A-6) 


To  minimize  d^,  the  partial  derivatives  of  d^  with  respect  to  a  and  b  must  be  zero.  Performing 
these  derivatives  gives 


ad^ 

da 


=  -2z  •  X  +  2ax^  +  2bx  •  y  =  0 


(A-7) 


ab 


=  -2z  •  y  +  2by^  +  2ax  •  y  =  0 


Rearranging  terms  and  dropping  the  2's  gives 


(A-8) 


ax^  +bx*y  =  z»x 


(A-9) 


ax  •  y  +  by^  =  z  •  y 


or  in  the  adopted  vector-matrix  notation  this  becomes 
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Thus  from  the  vector-matrix  equation 
V  =  (M)  c 


(A-10) 


(A-11) 


(A-12) 


the  c  coefficient  vector  is  determined  by  multiplying  equation  (A-12)  through  by  the  inverse  of 
(M),  i.e., 


(M)  ^v  =  (M)  ^(M)c  =>  c  =  (M)  ^v  (A-13) 

notice  that  (M)'*  and  v  on  the  right  are  all  known  data. 

For  a  single  independent  variable  x,  then  b=0  and  there  is  no  y  variable.  In  this  special 
one  coefficient  case,  equation  (A-1 1)  becomes 
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Z  •  X 


or 


a  = 


Z  •  X 


(A-14) 


=  ax 
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It  now  remains  to  associate  the  foregoing  with  the  actual  solar  flux  and 

geomagnetic  index  Ap  data  and  the  statistical  method  (MLLRT)  currently  in  use  at  MSFC. 

X  ->  AR„,  AR„  =  R„  -  R„, 

y  ^m-l>  =  Rm-1  ’  ^m-l’ 

-  1  ^ 

and  R=  =  — Y  R.  1  <  i  <  132 

j=i 


fAR.,  ] 

Rm2 

AR„r 

• 

AR„,  - 

m  is  the  most  current  13-month  smoothed  month  number  from  the  beginning  of  the  current 
cycle,  a  ^  c  and  b  — >  o  are  the  coefficients  corresponding  with  c  in  the  MLLRT  model  which 
is  a  one  coefficient  model,  z  is  any  21 -point  data  vector  from  the  132  in  the  cycle,  excluding 
ARjyj  and  ARj^.j.  z  is  the  model  prediction  for  any  of  the  known  points  within  any  of  the 
known  cycles  or  for  the  points  in  the  current  cycle. 

With  the  above  association,  the  MLLRT  model  can  now  be  presented  as 

AR„.p  =  c„.|.  AR„  (A-15) 

where 


^  ^  AR;„  *  AR, 

AR.  •  AR„ 

m  is  the  number  of  known  points  within  the  current  cycle,  p  is  any  point  beyond  m  within  the 
132-point  cycle.  Since  the  motivation  in  the  MLLRT  model  is  for  projecting  beyond  the  N 
known  data  points  and  beyond  m  in  the  current  cycle,  then  p>m  for  the  N+1  current  cycle. 

A4 


From  equation  (A- 15)  the  MLLRT  one  coefficient  model  is 
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m+p,  N+1 


R 


m+p 


^m+p^m,  N+1 


(A-16) 


where 


—  1  ^ 
j=l 


and 


N+1  “ 


Reference 


A-1.  Holland,  R.  L.,  C.  A.  Rhodes,  and  H.  C.  Euler:  Lagrangian  Least  Squares  Prediction  of  Solar 
Activity.  NASA  TM-82462,  George  C.  Marshall  Space  Flight  Center,  Huntsville,  Alabama, 
April  1982. 
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APPENDIX  B 


Cubic  Connection  From  Present  Predicted  Cycle  to  Mean  of  Next  Cycle 


Robert  L.  Holland 
Computer  Sciences  Corporation 
Huntsville,  Alabama 


The  following  steps  describe  the  procedure  for  developing  the  cubic  connection  from 
present  predicted  cycle  to  mean  of  next  cycle: 

1 .  Depending  on  when  the  present  cycle  was  initialized,  select  the  last  inflection  point 
of  the  estimated  (present)  cycle  (tj,  Fj)  near  the  24-month  point  before  the  next  cycle  maximum 
or  minimum.  The  24-month  limit  was  used  to  avoid  a  sudden  rise  or  fall  in  the  transition  gap, 
since  more  th^  one  inflection  point  exist  in  the  prediction  data.  This  should  be  done  with  at 
least  a  five-point  numerical  scheme  for  determining  where  the  second  derivative  goes  to  zero®'* 
One  such  scheme  is  Sterling’s  in  reference®'^ 


1 

^N+2  j  ^^N+l  I  ^^N-l  Fn~2 

[dPj 

N  ■  (At)^ 

.  12  3  2  3  12  . 

(B-1) 


2 .  Select  the  maximum  (or  minimum)  on  the  mean  cycle  (t„,  FJ.  This  is  normally  the 
input,  but  it  can  also  be  determined  numerically. 

3 .  The  second  derivative  should  be  zero  at  (tj,  F,),  and  the  first  derivative  is  zero  at  (t„, 
Fm)-  A  cubic  curve  is  the  lowest  degree  polynomial  which  can  be  determined  that  will  go 
through  the  two  points  and  have  these  properties  at  the  two  points. 

4 .  The  coefficients  of  the  polynomial 

F(t)=  aP  +  bP  +ct  +  d  (B-2 

are  to  be  determined  from  the  following  four  linear  equations  and  will  have  the  required  four 
properties: 


6ati  -f  2b  =  0  { second  derivative  is  zero  at  inflection  point  (B-3) 

3^tm+2bt^+c  =  0  {first  derivative  is  zero  at  maximum  or  minimum  (B-4) 

afj  +  b  tf  +  ct,  +  d  =  F,  (curve  goes  through  the  inflection  point  (tj,,  Fj),  (B-5) 

+  b  t^  +  ct^  +  d  =  F„  (curve  goes  through  the  maximum  (or  minimum) 

point  (t„,  F„)  (B-6) 


There  are  four  linear  equations  in  four  coefficients  which  may  be  written  in  vector/matrix 
notation  and  solve  for  a,  b,  c,  and  d.  Define  vector  F,  matrix  (M)  and  vector  of  coefficients  to 
be  determined  c  as  follows: 
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The  corresponding  vector/matrix  equation  for  equations  (B-3)  through  (B-6)  above  is 
(M)c  =  F 

To  solve  for  c  multiply  the  inverse  (M) '  i.e. 

(M)-'(M)c  =  (M)-'F 
then  c  =  (M)'F. 

Note  that  all  the  elements  of  (M)  '  and  F  are  known  data. 

The  solution  for  c  becomes 
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APPENDIX  C 


Estimates  of  Percentiles  for  the  13-Month  Smoothed  Solar  Flux  Data 


Leonard  W.  Howell,  Jr. 

Electromagnetics  and  Aerospace  Environments  Branch 
Systems  Analysis  and  Integration  Laboratory 
George  C.  Marshall  Space  Flight  Center 
National  Aeronautics  and  Space  Administration 


The  intent  of  the  MLLRT  Model  and  the  periodic  memos  was  to  provide  NASA  programs  with 
an  estimation  of  the  most  probable  solar  activity  trends,  based  on  statistical  analysis  of  the 
historical  record,  with  confidence  bounds  placed  on  that  estimate.  However,  it  has  turned  out 
that  many  of  the  most  important  applications  required  estimates  for  the  high  extreme  values,  and 
so  the  97.7  percentile  envelope  was  used  in  this  application.  For  this  reason  a  re-examination  of 
the  methodology  for  evaluating  the  percentile  estimates  associated  with  these  bounding  values 
was  undertaken  and  some  changes  have  been  made.  This  appendix  summarizes  the  results  of 
this  evaluation  and  the  rationale  for  the  revised  approach. 

Percentiles  are  the  probability,  expressed  as  a  percentage,  that  a  variable  X  will  be  less  than  a 
specified  value  Xp  To  eliminate  repetition  on  the  factors  of  100  in  the  mathematical  expressions 
which  follow,  we  denote  this  probability  as  a  fraction,  or  “quantile”,  rattier  than  a  percentile.  The 
only  difference  between  percentile  and  quantile  is  that  percentile  refers  to  a  percent  of  the  set  of 
data  and  quantile  refers  to  a  fraction  of  the  set  of  data.  For  example,  for  a  standard  normal 
distribution,  1.96  is  the  0.975  quantile  and  is  the  97.5*  percentile  thus  p=  Pr{X  <  Xp}  and  it 
ranges  from  zero  to  one.  The  inverse  function  Q(p)=Xp  is  also  useful  to  yield  the  value  in  the 
range  (FI 0.7  flux,  in  this  case)  associated  with  a  given  probability  (p  value).  The  concept  of  a 
quantile  is  illustrated  in  Figure  C-1  for  an  arbitrary  probability  density  function  f(x)  and  its 
cumulative  distribution  F(x). 
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Figure  C-1.  Arbitrary  Probability  Density  Function  f(x)  and  Its  Cumulative  Distribution  F(x). 


Historically  the  MSFC  solar  activity  memos  have  reported  p  values  of  0.5,  the  median  or  “best 
estimate,”  and  bounding  values  of  0.023  and  0.977.  These  are  easy  to  interpret  for  the  engineering 
community  because  of  their  relationship  to  the  “2-sigma”  probabilities  of  the  Normal 
distribution,  even  though  there  are  strong  indications  that  the  Normal  distribution  is  not 
applicable  to  this  data  set. 

Calculation  of  percentile  probabilities  is  a  well  defined  process  in  cases  were  there  is  a  large  data 
set  or  where,  for  other  reasons,  the  sample  is  known  to  follow  a  specified  frequency  distribution. 
The  probability  is  the  integral  imder  the  frequency  distribution  from  the  lower  limit  of  the 
distribution,  often  -«>,  to  Xp.  For  the  problem  at  hand,  however,  where  the  data  set  consists  of 
only  21  data  points  and  the  appropriate  distribution  is  unknown,  the  probability  can  only  be 
estimated.  One  must  select  one  of  the  various  possible  methods.  Direct  calculation  of  a  precise 
value  which  “truly”  represents  the  situation  is  not  possible. 

For  illustration  we  will  use  the  13-month  smoothed  Lagrangian  flux  data  set  illustrated  in  Figure 
C-2.  The  13-Month  Smoothed  Lagrangian  Solar  Flux  data  is  calculated  from  the  monthly  solar 
flux  data  that  is  first  smoothed  by  the  Zurich  smoothed  equation.  The  13 -month  smoothed  time 
series  data  is  then  distributed  into  groups  of  cycle  data.  The  data  then  is  interpolated  by  a 
Lagrangian  method  into  21  “peak-to-peak”  solar  cycles,  with  each  cycle  consisting  of  132  points 
(months)  as  depicted  in  Figure  C-2.  It  should  be  noted  that  the  data  is  not  really  data  in  the 
usual  statistical  sense  of  “observations”,  but  it  is  actually  the  output  from  multiple  layers  of 
processing  of  the  original  observations.  This  fact  is  not  relevant  to  the  issues  addressed  in  this 
appendix,  but  it  is  important  to  the  physical  interpretation  of  the  solar  model  results.  The 
processing  tends  to  mask  some  of  the  actual  variations  and  properties  of  the  original  data. 
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Figure  C-2.  13-Month  Smoothed  Lagrangian  Flux. 


When  observing  this  data,  it  is  important  to  note  that  two  cycles  appear  to  run  side-by-side  and 
on  the  high  side  from  month  80  to  just  under  month  100,  and  then  again  briefly  around  months 
115-118.  Secondly,  the  mean  and  standard  deviation  trendlines  can  be  constructed  from  the  data 
and  are  shown  in  Figure  C-3. 


Figure  C-3.  Mean  and  Standard  Deviation  Trend  Lines. 


General 


Two  methods  for  estimating  percentiles  are  developed  and  compared.  Both  methods  presented 
make  no  attempt  to  analytically  model  an  underlying  distribution  and  thus  represent  strictly 
empirical  approaches.  The  first  method  discussed  makes  use  of  frequency  histograms  to 
construct  cumulative  frequency  polygons  and  is  similar  to  the  previous  method.  The  second 
method  presented  is  referred  to  as  the  Quantile  method  in  the  statistical  literature  and  will  be 
compared  to  the  cumulative  frequency  polygon  method. 

Method  1.  Cumulative  Frequency  Polygons 

Graphical  techniques  can  be  easily  implemented  and  they  are  very  valuable  in  exploratory  data 
analysis  and  in  combination  with  formal  statistical  tests.  In  the  former  the  objective  is  to 
discover  particular  features  of  the  underlying  distribution  such  as  outliers,  skewness  or  kurtosis 
(i.e.,  thickness  of  the  tails  of  the  distribution).  The  saying  that  a  picture  is  worth  a  thousand 
words  is  especially  appropriate  for  this  type  of  analysis. 

The  frequency  polygon  is  a  many  sided  figure  that  is  often  used  as  an  approximation  of  the 
underlying  probability  density  function  of  a  random  variable  and  is  constructed  from  a  histogram 
of  the  data.  Construction  and  use  of  the  frequency  polygon  is  well  documented  in  the  literature, 
and  the  interested  reader  is  referred  to  Kohler  [C-1],  Downie  and  Heath[C-2],  and  Kendall  and 
Stewart  [C-3].  To  draw  the  frequency  polygon,  the  same  set  of  coordinates  is  used  as  for  the 
histogram,  but  this  time  the  class  mark,  or  midpoint  of  each  class  width,  is  identified  (as  the 
average  of  the  two  class  limits)  and  a  dot  is  positioned  above  it  at  a  height  equal  to  the  relative 
frequency  density.  The  dots  are  then  connected  by  straight  lines.  To  complete  the  polygon, 
straight  lines  are  also  drawn  from  the  dots  above  the  first  and  last  class  marks,  respectively,  to 
points  on  the  horizontal  axis  a  distance  W  below  the  first  class  midpoint  and  a  distance  W  above 
the  last  class  midpoint,  where  W  is  the  class  width. 

This  method  is  illustrated  using  the  21  data  points  defined  by  the  first  month  of  the  21  solar 
cycles.  The  data  in  rank  order  is  given  below: 

S={98.3,  99.9, 107.1, 120.5, 123.2,  125.4,  132.9,  133.2,  145,  152.1, 155,  162.2,  162.3,  177.4, 

185.8, 186.4, 192,  202.2,  203.6,  203.7,  245.1). 

First,  the  sample  mean  is  computed  as  157.8  and  the  standard  deviation  as  39.7;  these  points  are 
visible  as  the  initial  points  (extreme  left  hand  end)  of  the  curves  in  Figure  C-3.  Next,  the 
standardized  data  set  is  constructed  as  Zi=(Xj  -  157.8)/39.7  which  gives 

Sstandardized  =  {-1-498,  -1.458,  -1.276,  -0.939,  -0.871,  -0.816,  -0.627,  -0.619,  -0.322, 

-0.143,  -0.07, 0.111, 0.114, 0.494, 0.706,  0.721,  0.862, 1.119, 1.154, 1.157, 2.2  } 

The  fi-equency  histogram  and  polygon  are  then  constructed  as  illustrated  in  Figure  C-4  for  this 
standardized  data  set. 
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standardized  Data 


Figure  C-4.  Relative  Frequency  Histogram. 


The  cumulative  frequency  polygon  or  “less-than”  ogive  is  then  constructed  directly  from  the 
polygon,  as  shown  in  Figure  C-5.  Quantiles  of  interest  are  then  approximated  using  linear 
interpolation.  More  sophisticated  methods  may  be  applied  by  first  smoothing  the  frequency 
polygon  and  then  using  higher  order  interpolation  methods  if  desired. 
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Figure  C-5.  Frequency  Polygon. 


Quantiles  may  be  obtained  from  Figure  C-5  using  linear  interpolation  which  are  then  transformed 
back  to  the  original  imits  of  the  data  by  multiplying  by  the  standard  deviation  and  then  adding  the 
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mean.  For  example,  Q(0.95)  is  1.996x39.7  +  157.8  =  237,  or,  in  terms  of  percentiles,  237  is  the 
95*'’  percentile. 


Discussion 

When  constructing  the  relative  frequency  polygon,  one  must  first  construct  the  relative  frequency 
histogram.  Constructing  histograms  usually  requires  a  certain  amount  of  personal  judgment.  For 
example,  the  number  of  classes  to  be  used  and  the  corresponding  class  width  is  a  matter  of 
personal  choice.  General  guidelines  recommend  between  five  and  twenty  classes;  a  smaller 
number  sacrifices  too  much  detail,  a  larger  one  retains  too  much  of  it.  Many  statisticians  use 
Sturgess’s  rule  to  determine  the  number  of  classes  k,  where  Sturgess’s  rule  gives  k=H-3.3 
logio(n),  where  n  is  the  sample  size.  Another  rule  found  in  Panofsky  and  Brier  [C-4]  suggests 
k=5  logio(n).  For  the  data  set  under  consideration,  n=21  so  the  two  rules  give  5.5  and  6.6,  and  so 
k=6  is  a  reasonable  number  of  classes  to  begin  with. 

Next,  the  class  width  W  must  be  determined.  Assuming  that  the  observations  have  been  put  in 
rank  order  so  that  Xi  <  X2  < ...  <  Xn ,  then  the  range  of  the  data  is  defined  to  be  R=Xn-Xi,  and  hence 
the  class  width  W  must  be  greater  than  or  equal  to  R/k  in  order  to  include  all  the  data  in  the 
frequency  count.  However,  if  one  simply  sets  W=R/k  then  the  endpoints  of  the  data  (xi  and  Xn) 
will  correspond  to  the  class  limits  of  the  first  and  last  class,  respectively,  which  is  usually  not 
appropriate.  In  fact,  the  general  guidelines  are  that  the  class  midpoints  should  roughly 
correspond  to  the  mean  of  the  data  points  that  fall  within  each  class.  However,  this  in  practice 
can  only  be  achieved  in  some  approximate  sense  so  that  it  is  very  unlikely  that  two  statisticians 
would  ever  arrive  at  the  same  histogram  parameters  for  the  same  data  set. 

This  is  a  very  important  consideration  when  implementing  a  general  purpose  algorithm  for 
estimating  quantiles  and  so  we  next  explore  the  sensitivity  of  quantile  estimation  with  respect  to 
the  class  width,  keeping  the  number  of  classes  fixed  at  k=6  for  this  data  set.  Two  methods  of 
class  width  selection  are  compared. 

The  first  method  selects  W=R/(k-l)  which  forces  the  endpoints  of  the  data  to  be  the  midpoints 
of  the  first  and  last  class.  This  method  is  applied  across  the  132  months,  at  each  month 
constructing  the  associated  cumulative  frequency  polygon  and  then  estimating  Q(0.95)  and 
Q(0.976).  A  second  method  is  chosen  where  the  mean  of  the  data  (which  is  zero  for  the 
standardized  data  set)  corresponds  to  one  of  the  class  midpoints  and  W=(R+l)/k  as 
recommended  in  Downie  and  Heath  [C-2]  which  insures  that  X]  and  X21  do  not  correspond  to  the 
first  and  last  class  limits.  Thus,  with  W  defined,  the  number  of  classes  set  at  k=6,  and  one  class 
midpoint  defined,  all  class  limits  are  automatically  determined  and  the  histogram  and 
corresponding  cumulative  polygon  can  be  constructed  and  the  quantiles  of  interest  estimated. 
Q(0.95)  and  Q(0.976)  are  compared  for  these  two  methods  in  Figures  C-6  and  C-7;  the  monthly 
mean  curve  is  included  in  each  figure  as  a  reference  curve.  Notationally,  the  graph  labeled 
Polygon2  refers  to  the  W=R/(k-l)  method.  Now,  in  both  of  these  figures  we  note  that  while  the 
two  graphs  are  fairly  close  to  each  other,  they  nevertheless  differ  in  a  number  of  places.  Most 
notably,  however,  is  that  each  graph  has  some  unusual  jagged  sections  in  one  region  but  does  not 
necessarily  have  a  jagged  section  in  the  graph  produced  by  the  other  method  in  the  same  region. 
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Figure  C-6.  Comparison  of  Frequency  Polygon  Methods  for  Q(0.95). 
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Figure  C-7.  Comparison  of  Frequency  Polygon  Methods  for  Q(0.976) 


This  clearly  points  out  how  sensitive  quantile  estimation  is  to  the  choice  of  the  class  width  W. 
Thus,  given  that  the  selection  of  W  (and  even  k,  the  number  of  classes)  is  quite  subjective  and 
variations  in  W  produce  different  emd  even  jagged  results  in  Q,  it  is  strongly  recommended  that 
quantiles  NOT  be  estimated  using  frequency  polygons.  However,  this  does  not  detract  from 
their  use  for  other  purposes,  such  as  their  graphical  value. 


Method  2;  Quantile  Plots 

Chambers,  et.  al.  [C-5],  give  a  succinct  discussion  of  using  quantile  plots  and  the  quantile 
function  in  their  book  Graphical  Methods  for  Data  Analysis.  As  with  Method  1 ,  these  methods 
will  be  illustrated  using  the  21  data  points  defined  by  the  first  month  of  the  21  solar  cycles.  The 
actual  data  presented  in  rank  order  is  given  below: 

S={98.3, 99.9, 107.1, 120.5, 123.2, 125.4, 132.9, 133.2, 145, 152.1,  155, 162.2, 162.3,  177.4, 

185.8,  186.4,  192, 202.2, 203.6,  203.7, 245.1}, 

and  recall  that  the  sample  mean  is  157.8  and  the  standard  deviation  is  39.7. 


The  authors  in  [C-5]  next  construct  an  operational  definition  of  quantile  and  give  supporting 
rationale  for  its  functional  form.  Starting  with  a  set  of  rank  ordered  data  given  as  Xj  for  i=l  to  n, 
with  xi  <  X2  <  ...  <  Xn  ,  let  p  represent  any  fraction  between  0  and  1.  Then  the  quantile  Q(p) 
corresponding  to  the  fraction  p  is  defined  to  be  Xj  whenever  p  is  one  of  the  fractions  Pi=(i-0.5)/n, 
for  i=l  to  n. 


Thus,  for  the  21  observations  in  S,  the  following  table  constructed,  where  pj=(i-0.5)/21  for  i=l  to 
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This  may  then  be  presented  as  a  quantile  plot  as  depicted  in  Figure  C-8. 
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Figure  C-8.  Quantile  Plot. 


So  far,  the  quantile  function  is  defined  only  for  certain  discrete  values  of  p,  namely  pj.  If 
necessary,  the  definition  of  Q(p)  is  extended  within  the  range  of  the  data  by  linear  interpolation. 
This  means  connecting  consecutive  points  with  straight  line  segments,  as  show  in  the  Figure  C-9 
below.  Symbolically,  if  p  is  a  fraction  f  of  the  way  from  pi  to  Pi+i,  then  Q(p)  is  defined  as 

Q(p)=(l-f)Q(pO  +  fQ(pi+i) 


This  cannot  be  used  to  define  Q(p)  outside  the  range  of  the  data,  where  p  is  smaller  than  0.5/n  or 
larger  than  (l-.5/n).  As  the  authors  in  [C-5]  state: 

“Extrapolation  is  a  tricky  business;  if  we  must  extrapolate  we  will  play  safe  and  define 
Q(p)=xi  for  p<pi  and  Q(p)=Xn  for  p>pn,  which  produces  the  short  horizontal  segments  at 
the  beginning  and  end  of  the  Figure  C-9.” 
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Figure  C-9.  Interpolated  Quantiles. 


In  practical  terms,  quantiles  smaller  than  0.238  (percentiles  smaller  than  2.38%)  or  larger  than 
0.976  (percentiles  larger  than  97.6%)  are  beyond  the  resolution  of  the  sample  size  under 
consideration  (n=21)  and  cannot  be  derived  from  the  data  without  further  assumptions.  Thus, 
for  this  dataset,  x,=98.3  is  the  0.0238  quantile  and  X2i=245.1  is  the  0.976  quantile.  This  result 
can  be  generalized  to  say  that  for  any  data  set  consisting  of  21  observations,  the  minimum  value 
Xi  is  the  0.0238  quantile  and  the  maximum  value  X21  is  the  0.976  quantile. 

In  order  to  go  beyond  these  limits,  one  approach  we  considered  was  to  learn  if  the  theoretical 
endpoints  of  the  distribution  of  data,  corresponding  to  Q(0)  and  Q(l),  were  known  or  could  be 
estimated  using  physical  laws.  Linear  interpolation  could  then  be  used  to  approximate  the  very 
small  or  very  large  quantiles  of  interest.  Unfortunately  this  approach  did  not  produce  usable 
bounds. 

The  quantile  method  is  applied  to  the  21  solar  cycles  at  each  of  the  132  months  to  give  Q(0.95) 
and  Q(0.976),  i.e.,  the  95*  and  97.6*  percentiles.  These  results  are  depicted  in  Figure  C-10. 
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Note  that  Q(0.95)  and  Q(0.976)  are  roughly  identical  in  the  months  80-95  and  months  115-118 
where  we  previously  noted  that  two  cycles  appear  to  run  side-by-side  and  on  the  high  side.  This 
is  because  Q(0.95)  is  computed  by  interpolating  between  X20  and  X21  which  are  roughly  equal  to 
each  other  in  these  regions.  Furthermore,  Q(0.95)  yields  a  somewhat  smoother  graph  than 
Q(0.976)  as  one  would  expect  because  Q(0.95)  is  determined  by  linear  interpolation  between  X20 
and  X21  (which  is  a  form  of  averaging  and  hence  smoothing),  while  Q(0.976)  is  always  the 
maximum  of  the  data  (X21)  and  is  therefore  more  sensitive. 


The  Quantile  Method  and  Program  Risk 

As  stated  previously.  Chambers,  et  al.[C-5],  define  the  quantile  Q(p)  corresponding  to  the 
fraction  p  to  be  Xi  whenever  p  is  one  of  the  fractions  pi=(i-0.5)/n,  for  i=l  to  n.  This  definition  of 
Q  is  widely  used  and  the  authors  support  their  choice  of  the  functional  form  of  Q  as  follows: 

“Why  do  we  take  pi  to  be  pi=(i-0.5)/n  and  not,  say  i/n  ...  or  several  other  reasonable 
choices?  We  will  mention  only  that  when  we  separate  the  ordered  observations  into 
two  groups  by  splitting  exactly  on  an  observation,  the  use  of  the  function  (i-0.5)/n 
means  that  the  observation  is  covmted  as  being  half  in  the  lower  group  and  half  in  the 
upper  group.” 

This  is  a  very  interesting,  if  not  compelling,  argument  in  favor  of  this  functional  form,  and  would 
usually  be  used  without  further  consideration  in  most  applications,  especially  in  regions  away 
from  the  endpoints  of  the  data.  However,  it  is  important  to  look  at  the  application  of  these 
results  to  NASA  programs  and  the  inherent  program  risks  associated  with  their  use.  For 
example,  eis  discussed  in  the  preceding  p2iragraphs,  this  functioned  form  for  Q  suggests  that  if  we 
select  the  largest  observation  from  a  sample  of  size  21,  then  we  define  it  to  be  the  0.976  quantile, 
or  in  programmatic  terms,  we  are  stating  that  the  risk  is  only  2.4%  and  this  result  is  based  on  a 
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sample  of  size  21,  without  any  probability  distribution  assumptions!  Because  significant 
program  design  and  planning  decisions  may  potentially  be  based  on  these  numbers,  a  deeper 
investigation  of  their  properties  is  appropriate. 

First,  note  that  the  quantile  function  described  above  is  associated  with  a  special  case  of  the  so- 
called  empirical  distribution  function  (EDF),  which  is  a  step  function  generally  defined  by 

‘'1\,  forOScSl  (C-1) 

n-2c  +  l 


for  the  ordered  observations  Xj  <  X2 ...  <  Xn .  Figure  C-1 1  illustrates  the  inverse  relationship 
between  the  quantile  function,  Q,  and  the  empirical  distribution  function,  Fn,  where  pi  is  defined 
to  be  pi=(i-c)/(n-2c+l). 


Relationship  Between  the  Empirical  Distribution  Function,  F„, 
and  the  Quantile  Function,  Q,:  Q  is  the  Inverse  of  F„ 
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Figure  C-1 1.  Relationship  Between  the  Empirical  Distribution  Function,  Fn,  and  the  Quantile 

Function,  Q. 


The  objective  at  this  point  is  to  select  a  value  of  c  that  makes  the  most  sense  for  our  application. 
However,  it  must  be  understood  that  whatever  methods  employed  to  select  c  must  be  quite 
general  in  that  they  are  distribution  free,  i.e.,  there  will  be  no  probability  distribution 
assumptions  made.  This  is  necessary  because  the  underlying  probability  distribution  of  the  solar 
flux  under  consideration  is  obviously  not  known;  otherwise  the  quantiles  would  also  be  known 
exactly  and  there  would  be  no  reason  for  analysis.  With  this  in  mind,  a  branch  of  statistics 
referred  to  as  “order  statistics”  will  be  used. 
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For  a  sample  of  n  values  Xi,  X2,  Xn  with  a  probability  density  function  f(x)  and  cumulative 
distribution  F(x),  and  where  the  observations  have  been  arranged  in  rank  order  so  that  Xj  <  X2  ... 
<  x„  ,  then  the  probability  density  function  of  the  i*’’  order  statistic  Xj  is  given  by  the  following 
equation,  as  derived  in  Kendall  and  Stewart  [C-3], 


Thus,  for  any  distribution  with  density  function  f(x)  and  cumulative  distribution  F(x),  if  a 
random  sample  of  size  n  is  selected  from  this  distribution  and  the  observations  are  arranged  in 
increasing  order,  the  above  equation  defines  the  probability  density  function  of  the  i*  order 
statistic. 

For  example,  the  probability  density  function  of  the  minimum  or  smallest  observation  is  obtained 
by  setting  i=l  in  Equation  C-2,  giving  g(x,)  =  n{l-F(x,)}"”'f(x,).  Similarly,  the  probability 
distribution  of  the  maximum  or  largest  observation  is  obtained  by  setting  i=n  in  Equation  C-2, 
giving  g(x„)  =  n{F(x„)}"~*f(x„).  Now,  in  the  solar  flux  study,  nature  (and  processing  of  the 

data)  provides  a  random  sample  of  size  n  (n=21  at  present)  for  each  of  132  months,  and  then 
they  are  arranged  in  rank  order  to  construct  the  quantile  plot.  Since  a  linear  combination  of  these 
ranked  observations  is  to  be  used  to  provide  a  quantile  associated  with  a  particular  p-value,  the 
distribution  of  the  p-values  that  are  determined  by  a  linear  combination  of  the  order  statistics 
shall  be  investigated. 

That  is,  if  a  linear  combination  of  the  ranked  observations  is  used  to  estimate  a  particular  quantile 
Xp  of  interest,  the  question  “what  is  the  distribution  of  p-values  provided  by  this  linear 
combination  of  the  ordered  observations?”  must  be  answered.  For  example  the  maximum 
observation  X21  is  begin  considered  to  estimate  a  p-value  in  the  neighborhood  of  0.97.  Therefore, 
the  following  question  is  to  be  answered:  what  is  the  distribution  of  p-values  obtained  when  X21 
is  used  to  estimate  the  percentile  of  interest? 

Imagine  that  as  nature  generates  or  picks  the  21  observations  from  the  unknown  distribution,  the 
maximum  value  X21  is  selected  and  the  question  is  “what  is  the  p-value  associated  with  this 
particular  X21?”  The  answer  is  F(x2i),  i.e.,  the  area  under  the  density  function  f(x)  curve  and  to 
the  left  of  X21  as  illustrated  in  Figure  1 .  Repeating  this  sampling  process  of  selecting  another  set 
of  21  observations  and  picking  the  maximum  value  X21  of  each  of  these  sets  of  21  observations, 
one  would  like  to  know  the  distribution  of  the  p-value  associated  with  this  set  of  X2i’s.  That  is, 
what  is  the  probability  distribution  of  P2i=F(x2i)  and  more  generally,  the  distribution  of  pi=F(Xi) 
for  each  of  the  order  statistics? 

First,  note  that  the  p’s  and  the  x’s  are  in  the  same  order  since  p  is  a  non-decreasing  function  of  x, 
and  furthermore  0  <  pi  <'p2  ...  <  Pn  <  1  since  F  is  bounded  by  0  and  1.  As  discussed  above,  the 
objective  at  this  point  is  to  discover  the  probability  distribution  of  the  order  statistics  pi  .  To 
help  answer  this  question,  a  famous  result  in  probability  theory,  often  called  the  Probability 
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Integral  Transformation  p=F(x),  is  used.  This  theorem  proves  that  the  distribution  of  p  is  the 
standard  imiform  distribution,  independent  of  the  distribution  F(x).  That  is,  f(p)=l  for  0  <  p  <1 
and  F(p)=p.  In  fact,  this  is  the  imderlying  theory  behind  random  number  generation  on  digital 
computers,  which  is  often  use  in  Monte  Carlo  simulations.  The  computer  generates  a  standard 
uniform  random  number  u  between  0  and  1  and  then  solves  the  equation  x=F''(u)  to  generate 
random  numbers  from  any  specified  distribution  F(x). 

Inserting  this  result  in  the  general  formula  for  the  distribution  of  the  i*  order  statistic  given  in 
Equation  C-2,  the  probability  density  fimction  of  p;  is  obtained  and  is  seen  to  be  Beta 
distribution; 

j  •  •  1.  i  j  2  i(n-i+l) 

with  mean  and  vanance  given  by  ii= -  and  a  = - - - . 

^  n+1  (n  +  l)'(n  +  2) 


This  is  a  very  important  result  because  if  c  is  set  to  0,  i.e.,  c=0  in  Equation  (C-1),  then 

Fn(Xi)  =  — for  i  =  l,2,-  -,n  (C-4) 

which  is  identical  to  the  mean  of  the  distribution  of  Pi  and  explains  why  c=0  is  called  the  “mean” 
plotting  position.  For  example,  suppose  for  the  solar  sample  of  size  21  the  largest  observed 
value  X21  is  used  as  the  quantile.  The  question  is,  what  is  P21?  By  the  first  definition  of  Q  it 
would  be  P2i=Q(x2i)=(21-.5)/21=0.976.  However,  if  Q  is  defined  as  Q(Xi)=i/(n+l),  then  the 
definition  of  a  quantile  is  forced  to  coincide  with  the  mean  of  the  distribution  of  p;,  so  that 
P2i=21/22=0.954.  To  see  which  definition  makes  the  most  sense  for  the  present  application,  the 
distribution  of  P21  may  be  analyzed  by  setting  i=n=21  in  the  Beta  distribution  above,  getting 

f(P2,)  =  21p^:,  0<p<l  (C-5) 

As  noted  above,  the  mean  is  given  by  21/22=0.954,  and  the  median  is  given  by  ’■/F  =0.968. 
Furthermore,  a  90%  tolerance  interval  for  P21  is  determined  in  a  similar  manner  and  is  given  by 
{0.867,  0.998}  with  a  midpoint  of  0.932.  In  comparing  the  above  two  definitions  for  Q,  the 
choice  (i-0.5)/n  gives  a  result  closer  to  the  median  of  the  distribution,  but  the  choice  of  i/(n+l) 
yields  the  more  reasonable  result,  in  that  it  exactly  matches  the  mean  of  the  distribution  and  is 
also  more  central  to  the  midpoint  of  the  interval.  A  similar  argument  by  symmetry  holds  for  the 
minimum  order  statistic  X|  and  its  associated  pi. 

For  quantiles  not  corresponding  to  X]  and  X21,  linear  interpolation  should  be  used  as  suggested  by 
Chambers,  et.  al.,[C-5]  between  the  appropriate  values,  and  the  quantile  function  will  be  defined 
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as  Q(Xi)=i/(n+l).  Furthermore,  the  Beta  distribution  can  then  be  used  to  give  the  exact 
distribution  of  the  interpolated  or  “smoothed”  p  -values.  As  we’ve  seen  above,  the  means  will 
always  match. 

The  interested  reader  may  find  a  computer  simulation  helpful  in  visualizing  these  results.  A 
general  procedure  for  conducting  a  computer  simulation  is  outlined  below  and  an  example  is 
provided: 

Step  1:  Generate  21  random  numbers  from  any  specified  probability  distribution.  For  example, 
generate  21  numbers  from  a  standard  normal  distribution,  where  the  random  number  generator 
used  in  this  computer  program  generates  a  single  standard  normal  distribution  by  summing  12 
uniformly  distributed  random  numbers  and  subtracting  6.  This  insures  that  the  p-values  obtained 
in  Step  2  are  not  the  same  uniform  random  numbers  used  to  generate  the  random  sample  in  the 
first  place. 

Step  2:  Order  the  21  random  numbers  in  ascending  order.  Then  for  each  random  number, 
compute  its  p-value,  i.e.,  the  area  under  the  normal  distribution  and  to  the  left  of  it.  Store  these 
results  as  Run  1  as  in  Table  C-1. 

Step  3:  Repeat  steps  1  and  2.  This  is  done  5000  times  in  this  example. 

Step  4;  Analyze  the  results.  Relative  frequency  histograms  of  the  p,,  p,j,  and  pj,  data  in  Table 
C-1  are  constructed  and  compared  to  the  Beta  distribution  in  Equation  C-3.  These  results  are 
illustrated  in  Figures  C-12,  C-13,  and  C-14,  where  the  Beta  distribution  is  the  smoothed  curve 
shown  with  the  histogram,  showing  excellent  agreement  between  simulation  results  and  the 
theoretical  model.  Use  Table  C-2  to  compare  between  the  means  and  variances  of  the  simulated 
results  for  p,,  p,,,  and  pj,  and  corresponding  means  and  variances  of  the  Beta  distribution  given  in 
Equation  C-3. 
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Run  Table  C-1 .  Simulated  Data  and  Sample  Means  and  Variances 
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Figure  C-12.  Relative  Frequency  Histogram  and  Probability  Density  Function  of  pj. 


Figure  C-13.  Relative  Frequency  Histogram  and  Probability  Density  Function  for  pn. 
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Figure  C-14  Relative  Frequency  Histogram  and  Probability  Density  Function  for  p2\- 


Continuing  with  a  discussion  of  program  risks,  it  is  standard  practice  to  talk  about  the  Mean  Time 
Between  Failures  (or  the  occurrence  of  some  disastrous  event),  and  hence  using  the  mean  i/(n+l) 
equates  correctly  to  this  terminology.  A  similar  argument  in  favor  of  i/(n+l)  versus  (i-0.5)/n  by  E. 
J.  Gumbel  [C-6]  states  the  following  about  the  method  of  using  (i-0.5)/n  in  a  discussion  of  return 
periods: 

the  method  “  claims  that  an  event  which  has  already  happened  once  in  n  years  will 
occur,  in  the  mean,  in  2n  years.  If  the  extreme  observation  has  economic  consequences,  as 
in  the  case  of  floods,  the  danger  factor  is  heavily  under-estimated.  The  compromise  is 
misleading  where  the  plotting  problem  is  of  most  interest.” 

He  continues  on  with  his  discussion  of  return  periods  of  extreme  events  and  concludes  that  i/(n+l) 
is  the  most  appropriate  to  use  for  these  applications. 

In  conclusion,  it  is  recommended  that  the  following  methods  be  used  when  using  the  quantile 
method  for  the  present  application: 

(1)  Use  the  approach  outlined  by  Chambers  [C-5],  but  set  Q(Pi)=Xi  when  pi=i/(n-f-l) 
instead  of  (i-0.5)/n 

(2)  use  the  linear  interpolation  method  when  necessary  as  previously  discussed 

(3)  recognize  that  percentiles  smaller  than  4.5%  or  larger  than  95.4%  are  beyond  the 
resolution  of  the  sample  size  under  consideration  (n=21)  and  cannot  be  derived  from  the 
data  without  further  assumptions. 

(4)  use  Q(0.95)  instead  of  Q(0.954)  because  it  is  more  common  and  will  be  smoother  as  a 
result  of  interpolation,  as  previously  discussed. 

Using  these  methods  for  the  data  set  under  consideration,  Q(0.95)=240.96.  Figure  C-15  shows 
Q(0.95)  with  these  recommendations  implemented.  The  mean  is  provided  as  a  reference  curve. 
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Figure  C-15.  Quantile  Method:  Q(0.95) 


Recommendations 

Two  methods  of  estimating  percentiles  are  investigated  in  this  report.  The  quantile  method  and  the 
relative  frequency  polygon  method  make  no  attempt  to  analytically  model  an  underlying 
distribution  and  thus  represent  strictly  empirical  approaches.  Of  these  two  methods,  the  relative 
frequency  polygon  method  is  not  recommended  .for  use  in  estimating  percentiles  because  of  the 
sensitivity  of  Q(p)  to  the  class  width  and  resulting  jagged  edges  in  Q. 

Therefore,  it  is  recommended  that  the  Quantile  method  be  used  to  estimate  percentiles  as  long  as 
they  are  within  the  resolution  of  the  sample  size.  As  stated  earlier,  this  means  that  percentiles 
smaller  thp  4.5%  and  larger  than  95.4%  are  beyond  the  resolution  of  the  sample  size  under 
consideration  (n=21)  and  cannot  be  derived  from  the  data.  Furthermore,  we  learned  that  Q(0.95) 
yields  a  somewhat  smoother  graph  because  of  the  linear  interpolation  between  X20  and  x,,  while 
Q(0.954)  is  always  the  m^imum  of  the  data  (Xj,)  and  is  therefore  more  sensitive.  Therefore,  it  is 
recommended  that  the  95*  percentile  be  estimated  in  the  present  application  when  using  the  quantile 
method. 
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APPENDIX  D 


Modified  McNish-Lincoln  and  Quantile  Subroutine  Examples 


SUBROUTINE  LINCMC  (R,N,NU2) 

INTEGER*2  P, Q, IDG, K, I, J, M,MUPRIM,MU2P1,MU2 ,MU1 , MU, NUl , NU2 , 12 , NU,N 

REAL*8  UPPER_PERCENTILE,LOWER_PERCENTILE,R(132,25) ,RMEAN{264) , 
1RPRED(264) ,ERRUP(264) ,ERRDN(264) ,DR(132,25) ,A(20,20) ,AINV(20,20) , 
2APRIME(100) ,B{132,25) ,C(132,25) , JDUM(20) ,DETA(2) ,SIGM(132) , SIGMA 

COMMON  /BLKl/  12 
COMMON  /IDG/  IDG 

COMMON  /ANS/  RMEAN, RPRED, ERRUP, ERRDN, SIGM 
COMMON  /COUNT/  NU,NU1,MU,MU1,MU2,MU2P1,M 
COMMON  /PRCNT/  UPPER_PERCENTILE, LOWER_PERCENTILE 

EQUIVALENCE  (A{1,1) ,AINV(1,1) ,APRIME(1) ) 


MU1=MU2-MU+1 

NU=NU2-NU1+1 

MUPRIM=M-MU2 


IF  (IDG.EQ.1.0R.IDG.EQ.2)  THEN 


WRITE  (44,400) 
WRITE  (44,410) 
WRITE  (44,410) 
WRITE  (44,420) 

WRITE  (44,410) 


'YOU  ARE  IN  LINCMC  SUBROUTINE' 

'M=  ',M, 'N=  ',N, 'NU1=  ',NU1 
'NU2=  ',NU2,'MU=  ' ,MU, 'MU2=  ',MU2 
' UPPER_PERCENTILE=  ' , UPPER_PERCENTILE , 

' LOWER_PERCENTILE=  ' , LOWER_PERCENTILE 
'MU1=  ',MU1,'NU=  ’ ,NU, 'MUPRIM=  ' ,MUPRIM 


DO  110  J=NU1,N 

WRITE  (44,440)  'R(',J, ')' 

WRITE  (44,490)  (R(I, J) , I=1,M) 

110  CONTINUE 

WRITE  (44,400)  'GOING  INTO  DO  LOOP  130' 
END  IF 


DO  130  1=1, M 
RMEAN(I)=0. 

DO  120  J=NU1,NU2 

RMEAN(I)=RMEAN(I)+R(I, J) 
120  CONTINUE 

RMEAN ( I ) =RMEAN ( I ) /NU 
130  CONTINUE 


IF  (IDG.EQ.1.0R.IDG.EQ.2)  THEN 

WRITE  (44,400)  'FORMAT  OF  WRITE  IS  I, RMEAN' 
WRITE  (44,460)  (I,RMEAN(I) , I=1,M) 

WRITE  (44,400)  'GOING  INTO  DO  LOOP  150' 

END  IF 
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DO  150  1=1, M 

DO  140  J=NU1,N 

DR(I,  J)=R(I,  J)-RMEAN(I) 

140  CONTINUE 

150  CONTINUE 

IF  (IDG.EQ.1.0R.IDG.EQ.2)  THEN 
DO  160  J=NU1,N 

WRITE  (44,440)  'DR(',J, ')' 

WRITE  (44,490)  (DR(I, J) , I=1,M) 

160  CONTINUE 

WRITE  (44,400)  'GOING  INTO  DO  LOOP  190' 

END  IF 

DO  190  K=1,MU 
DO  180  Q=1,MU 
A(K,Q)=0. 

DO  170  J=NU1,NU2 

A(K,Q)=A(K,Q)+DR(MU1+K-1, J) *DR (MUl+Q-1 , J) 
170  CONTINUE 

180  CONTINUE 

190  CONTINUE 

IF  (IDG.EQ.1.0R.IDG.EQ.2)  THEN 
DO  200  Q=1,MU 

WRITE  (44,440)  'A{',Q, ')' 

WRITE  (44,510)  (A(K,Q) ,K=1,MU) 

200  CONTINUE 

WRITE  (44,400)  'GOING  INTO  DO  LOOP  230' 

END  IF 

DO  230  P=1,MUPRIM 
DO  220  Q=1,MU 
B(P,Q)=0. 

DO  210  J=NU1,NU2 

B(P,Q)=B{P,Q) +DR(MU2+P, J) *DR(MU1+Q-1 , J) 
210  CONTINUE 

220  CONTINUE 

230  CONTINUE 

IF  (IDG.EQ.1.0R.IDG.EQ.2)  THEN 
DO  240  Q=1,MU 

WRITE  (44,440)  'B( ' ,Q, ' ) ' 

WRITE  (44,510)  (B (P,Q) , P=1,MUPRIM) 

240  CONTINUE 

WRITE  (44,400)  'GOING  INTO  DO  LOOP  260' 

END  IF 


K=0 


DO  260  J=1,MU 
DO  250  1=1, MU 
K=K+1 

APRIME(K)=A(I, J) 
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250  CONTINUE 

260  CONTINUE 

IF  (IDG.EQ.1.0R.IDG.EQ.2)  THEN 

WRITE  (44,400)  'FORMAT  OF  WRITE  IS  K,APRIME' 
WRITE  (44,460)  (I , APRIME (I ) , 1=1 , K) 

END  IF 

DETA(1)=3. 

DETA(2)=0. 

CALL  GJR  (A,MU,MU,MU,MU,*390, JDUM,DETA) 

IF  (DETA(l) .EQ.O. )  GO  TO  390 

K=MU*MU+1 

DO  280  J=MU,1,-1 
DO  270  I=MU,1,-1 
K=K-1 

AINV ( I , J) = APRIME (K) 

270  CONTINUE 

280  CONTINUE 

IF  (IDG.EQ.1.0R.IDG.EQ.2)  THEN 
DO  290  J=1,MU 

WRITE  (44,450)  ' AINV( ’ , J, ' ) ' 

WRITE  (44,500)  (AINV(P, J) , P=1,MU) 

290  CONTINUE 

WRITE  (44,400)  'GOING  INTO  DO  LOOP  320' 

END  IF 

DO  320  P=1,MUPRIM 
DO  310  K=1,MU 
C(P,K)=0. 

DO  300  Q=1,MU 

C(P,K)=C(P.K)+B(P,Q) *AINV(Q,K) 

300  CONTINUE 

310  CONTINUE 

320  CONTINUE 

IF  (IDG.EQ.1.0R.IDG.EQ.2)  THEN 
DO  330  Q=1,MU 

WRITE  (44,440)  'C( ' ,Q, ' ) ' 

WRITE  (44,500)  (C(P,Q) , P=1,MUPRIM) 

330  CONTINUE 

WRITE  (44,400)  'GOING  INTO  DO  LOOP  360' 

END  IF 

DO  360  P=1,M 

IF  (P.GT.MU2)  GO  TO  340 
RPRED ( P) =R ( P, NU2+1 ) 

GO  TO  360 

340  RPRED (P)=0. 
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DO  350  K=1,MU 

RPRED(P) =RPRED(P) +C (P-MU2 , K) *DR (MUl+K-1 , NU2+1 ) 

350  CONTINUE 

RPRED ( P ) =RPRED ( P ) +RMEAN ( P ) 

360  CONTINUE 

IF  (IDG.EQ.1.0R.IDG.EQ.2)  THEN 

WRITE  (44,400)  ‘FORMAT  OF  WRITE  IS  I, RPRED' 

WRITE  (44,460)  (I,RPRED(I) , I=1,M) 

END  IF 

MU2P1=MU2+1 

IF  (I1DG.EQ.1.0R.IDG.EQ.2)  THEN 
WRITE  (44,430)  ■MU2Pl=  ',MU2P1 
END  IF 

j^************************************************************ 

C  WHEN  12=3  THE  SUBROUTINE  BYPASSES  THE  SIGMA  TO  CALCULATE 
C  95%  AND  5%  USING  EMPERICAL  STATISTICS  WITH  QUANTILE  FUNCTION. 

C************************************************************ 

C 


IF  (I2.EQ.3)  THEN 

CALL  PRCNTILE  (DR,C,NU2) 


IF  (IDG.EQ.l. 
WRITE  (44, 
WRITE  (44, 
WRITE  (44, 


WRITE 
WRITE 
WRITE 
WRITE 
WRITE 
WRITE 
WRITE 
WRITE 
WRITE 
WRITE 
END  IF 


OR. IDG. EQ. 2)  THEN 
410)  ■M=  ■,M, ■N=  •,N, ’NU1=  ',NU1 

410)  'NU2=  ',NU2,’MU=  ',MU, 'MU2=  ',MU2 

420)  ' UPPER_PERCENTILE=  ' , UPPER_PERCENTILE , 

• LOWER_PERCENTILE=  ' , LOWER_PERCENTILE 
(44,410)  'MU1=  ',MU1,’NU=  ' , NU , ' MUPRIM=  ',MUPRIM 

(44,400)  'FORMAT  OF  WRITE  IS  I,RMEAN' 

(44,460)  (I,RMEAN(I) ,I=1,2*M) 

(44,400)  'FORMAT  OF  WRITE  IS  I, RPRED' 

(44,460)  (I,RPRED(I) ,I=1,2*M) 

(44,400)  'FORMAT  OF  WRITE  IS  I,ERRUP' 

(44,460)  (I,ERRUP(I) ,I=1,2*M) 

(44,400)  'FORMAT  OF  WRITE  IS  I,ERRDN' 

(44,460)  (I,ERRDN(I) ,I=1,2*M) 

(44,400)  'YOU  ARE  EXITING  LINCMC  SUBROUTINE' 


C 

C 

C 


******************************************************** 
*  RETURN  STATEMENT  TO  SEND  BACK  TO  MAIN  * 

******************************************************** 


RETURN 

ELSE 

IF  (IDG.EQ.1.0R.IDG.EQ.2)  THEN 

WRITE  (44,400)  'GOING  INTO  DO  LOOP  380' 
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WRITE  (44,400)  'FORMAT  OF  WRITE  IS  P, SIGMA' 

END  IF 

DO  380  P=MU2P1,M 
SIGMA=0 . 

DO  370  J=NU1,NU2 

SIGMA=SIGMA+(R(P, J)-RPRED(P) ) **2 
370  CONTINUE 

SIGMA=SQRT (SIGMA/ (NU-MU+1) ) 

ERRUP ( P ) =RPRED ( P ) +2 . *  SIGMA 
ERRDN ( P ) =RPRED ( P ) -2 . *  SIGMA 
IF  (IDG.EQ.1.0R.IDG.EQ.2)  THEN 
WRITE  (44,480)  P, SIGMA 
END  IF 

IF  (ERRDN(P) .LT.O. )  ERRDN(P)=0. 

380  CONTINUE 

IF  (IDG.EQ.1.0R.IDG.EQ.2)  THEN 

WRITE  (44,470)  (P, ERRUP (P) , ERRDN (P) , P=MU2P1 , M) 

WRITE  (44,400)  'IF  12  NOT  EQUAL  TO  3  RETURN  TO  MAIN' 
END  IF 

END  IF 


RETURN 


390 

WRITE 

STOP 

(6,*)  'MATRIX  WAS  SINGLR' 

400 

FORMAT 

(A50) 

410 

FORMAT 

(2X,A7,I4,1X,A7,I4,1X,A7,I4) 

420 

FORMAT 

(2X,A18,F5.2, 1X,A18,F5.2) 

430 

FORMAT 

(2X,A7, 15) 

440 

FORMAT 

(2X,A3,I3,A2) 

450 

FORMAT 

(2X,A6, I3,A2) 

460 

FORMAT 

(6(1X,I3,1X,F8.2) ) 

470 

FORMAT 

(3(1X,I3,1X,F8.2,1X,F8.2) ) 

480 

FORMAT 

(2X,I4,1X,F8.2) 

490 

FORMAT 

(12F6.1) 

500 

FORMAT 

(5(1X,F9.6)) 

510 

FORMAT 

(8(1X,F8.2) ) 

END 


SUBROUTINE  QUANTILE  (DDR,NU2,P) 

C - 

C  THIS  SUBROUTINE  RANKS  THE  DATA  IN  ASCENDING  ORDER  THEN  ALLOCATES 
C  THE  QUANTILE  DISTRIBUTION  FOR  THE  DATA  GIVEN. 

C - 

C 

C - 

C  THE  DATA  IS  DIMENSIONED  STARTING  AT  ZERO  WHEN  THE  DIMENSION 
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C  STATEMENT  HAS  0:  IN  IT. 

C - 


INTEGER*2  NU2,P,K, IDG, I , M, MU2P1 , MU2 , MUl , MU, NUl , NU, 12 
REAL*8  DDR(50) ,RNKANS(50) ,C,QUANT(50) 

COMMON  /BLKl/  12 

COMMON  /IDG/  IDG 

COMMON  /ANS2/  RNKANS, QUANT 

COMMON  /COUNT/  NU,NU1,MU,MU1,MU2,MU2P1,M 

IF { IDG . EQ . 1 . OR . IDG . EQ . 2 ) THEN 

WRITE  (44,120)  ■NU2=  ■,NU2 

WRITE  (44,110)  'WRITE  FORMAT  IS  I, DDR' 

WRITE  (44,140)  (I,DDR(I) , I=1,NU2) 

WRITE  (44,110)  'CALL  RANKIT' 

END  IF 

C - 

C  CALLS  RANKIT  SUBROUTINE  TO  SET  THE  DATA  IN  ASCENDING  ORDER 

C - 


CALL  RANKIT  ( DDR , NU2 , RNKANS ) 

IF(IDG.EQ.1.0R.IDG.EQ.2)THEN 

WRITE  (44,110)  'WRITE  FORMAT  IS  I, RNKANS 

WRITE  (44,140)  (I , RNKANS ( I ) , 1=1 , NU2 ) 

END  IF 


C - 

C  CALCULATE  THE  QUANTILE  FUNCTION  SEE  APPENDIC  C  OF  TM  TBD  FOR 
C  EQUATION. 

C - 


DO  1=1, NU2 

C  QUANT(I)=(FLOAT(I)-.5)/(FLOAT(NU2) ) 

QUANT ( I ) =FLOAT ( I ) / (FLOAT (NU2 ) +1 . 0 ) 
END  DO 


C - WRITE  STAEMENTS  FOR  TROUBLE  SHOOTING  PROGRAM 


C 


IF  (IDG.EQ.1.0R.IDG.EQ.2)  THEN 


WRITE  (44,120) 
WRITE  (44,110) 
WRITE  (44,130) 
WRITE(44, 6000) 
END  IF 


' P=  '  ,  P 

'FORMAT  OF  WRITE  IS  K , RNKANS , QUANT ' 
(K, RNKANS (K) , QUANT (K) ,K=1,NU2) 

'CALL  FNDLAMBDA' 


RETURN 

C - FORMAT  STATEMENTS 


110 

FORMAT 

(A50) 

120 

FORMAT 

(2X,A7, 15) 

130 

FORMAT 

(1X,I3,1X,F8.2,F8.2) 
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140  FORMAT  ( 6 (IX, 13 , IX, F8 . 2 ) ) 
END 
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APPENDIX  E 


MSEC  Monthly  Solar  Activity  Memorandum  Example 


EL23  (Example) 


TO;  Distribution 

FROM:  EL23/Chief,  Electromagnetics  and  Aerospace  Environments  Branch 

SUBJECT :  Solar  Activity  Inputs  for  Upper  Atmospheric  Models  Used  in  Programs 
to  Estimate  Spacecraft  Orbit^  Lifetime 


The  solar  activity  information  in  this  memorandum  is  provided  as  input  data  for  upper 
atmospheric  models  to  insure  compatibility  between  calculations  made  for  spacecraft  orbital 
lifetime  predictions.  The  Marshall  Engineering  Thermosphere  Model  (NASA  CR- 179359 
and  CR-179389),  as  well  as  the  NASA/MSFC  Global  Reference  Atmospheric  Model-1995 
Version  (NASA  TM-4715),  require  inputs  of  the  10.7-cm  solar  flux  (Fjo,)  and  the 
geomagnetic  index  (\)  to  compute  atmospheric  density.  Statistical  estimates  are  provided 
for  the  future  13-montn  smoothed  values  of  these  parameters. 

To  provide  a  better  statistical  estimate  of  the  forthcoming  solar  minimum  and  the  temporal 
profile  for  cycle  23,  the  MSFC  linear  regression  program  has  been  revised.  The  MSFC 
linear  regression  program  is  based  on  the  Lagrangian  least-squares  statistical  technique. 
This  t^hnique  is  described  in  the  papers  “Lagrangian  Least-Squares  Prediction  of  Solar 

Flux  (Fjo,)”,  Journal  of  Geophysical  Research,  Vol.  89,  Number  Al,  Pages  11  through 
16,  January  1, 1984.  A  more  detailed  explanation  of  the  MSFC  linear  regression  computer 
program  and  modifications  that  took  place  in  1995  and  1996  are  in  the  paper  “Statistical 
Technique  for  Intermediate  and  Long-Range  Estimation  of  13-Month  Smoothed  Solar  Flux 
and  Geomagnetic  Index”  (NASA  TM-TBD)  which  is  being  processed  for  publication. 

MSFC’s  linear  regression  program  uses  the  13-month  smoothed  10.7-cm  solar  flux  (Fk,,) 
cycles  1  through  21_converted  and  observed  data  base  and  the  13-month  smoothed 

geomagnetic  index  (Ap)  cycles  13  through  21  converted  and  observed  data  base.  The 
program  estimates  the  intermediate-term  (months)  and  long-term  (years)  behavior  of  the  13- 

month  smoothed  solar  flux  (Fjo^)  and  13-month  smoothed  geomagnetic  index  ( Ap),  up  to 
132  months  into  the  future,  initialized  from  the  cycle  22  June  1989  maximum.  Once  cycle 
23  has  been  established  from  observed  solar  activity  data,  the  solar  flux  program  and 
geomagnetic  index  program  will  be  re-initialized  at  the  minimum  for  cycle  23.  The  mean 
cycle  period  of  1 1  years  (132  months)  will  be  assumed  for  cycle  23.  Cycle  24  will  then  be 
estimated  using  the  solar  flux  statistics  for  cycles  1  through  22  and  geomagnetic  index 
statistics  for  cycles  13  through  22. 

The  changes  of  orbital  density  associated  with  short-term  (days)  variations  in  the  daily 
10.7-cm  solar  flux  and  the  3-hourly  geomagnetic  index,  required  as  inputs  to  the  upper 
atmospheric  models,  are  not  represented  by  the  13-month  smoothed  statistical  estimates 


E-1 


given  in  these  tables.  This  dynamic  component  of  the  total  atmospheric  density  cannot  be 
estimated  into  the  future  with  any  acceptable  degree  of  statistical  confidence  using  existing 
techniques.  Representative  data  sets,  based  on  past  daily  10.7-cm  solar  flux  and  3-hourly 
geomagnetic  index  values,  may  be  utilized  to  compute  this  dynamic  component  of  the 
orbital  altitude  density. 

The  design  requirements  for  solar  activity  and  associated  on-orbit  density  values  are 
specified  in  the  appropriate  spacecraft  and  space  vehicle  project  design  requirements 
documentation.  These  documents  should  be  consulted  for  this  information. 


Enclosed  are  the  following  tables  and  figures: 

Table  E-1  (Enclosure  1)  contains  values  for  the  recent  monthly  10.7-cm  solar  flux  and 
geomagnetic  index. 

Table  E-2  (Enclosure  2)  contains  the  mid-point  calculated  values  of  the  13-month  smoothed 
observed  10.7-cm  solar  flux. 

Table  E-3  (Enclosure  3)  contains  the  statistical  estimate  of  13-month  smoothed  10.7-cm 
solar  flux  and  geomagnetic  index  for  balance  of  cycle  22,  cycle  23,  and  the  beginning  of 
cycle  24. 

Figure  E-1  (Enclosure  4)  is  a  plot  of  solar  cycle  22  monthly  mean  and  13-month  smoothed 
observed  10.7-cm  solar  flux. 

Figure  E-2  (Enclosure  4)  is  a  plot  of  the  13-month  smoothed  10.7-cm  solar  flux  statistical 
estimates  for  solar  cycles  22,  23,  and  beginning  of  cycle  24. 

The  50  percentile  values  in  table  E-3  and  figure  E-2,  at  a.nd  beyond  miaimum  of  cycle  23, 
are  approximated  by  the  arithmetic  mean  and  given  with  95  percentile  and  5  percentile 
values. 

The  information  in  this  memorandum  is  based  upon  data  received  from  the  National 
Research  Council  of  Canada  for  the  Series  C  10.7-cm  solar  flux  data  and  the  Institute  for 
Geophysics  in  Gottingen,  Germany  for  the  geomagnetic  index  data. 

Questions  on  the  contents  of  this  memorandum  may  be  addressed  to  Harold  Euler  at 
(205)  544-2282. 


APPROVAL: 


4  Enclosures 
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TABLE  E-1 


RECENT  MONTHLY  MEAN  SOLAR  ACTTVITY  VALUES 


1994 


1995 


1996 


Solar  Flux  Geomagnetic 

F,o7  (Series  C)  Index  Ap 


January 

115.0 

15 

February 

99.6 

30 

March 

90.4 

24 

April 

79.1 

29 

May 

79.9 

26 

June 

77.3 

14 

July 

80.5 

11 

August 

76.1 

8 

September 

79.1 

8 

October 

87.7 

22 

November 

80.9 

14 

December 

84.1 

13 

January 

82.7 

11 

February 

85.6 

15 

March 

85.1 

15 

April 

77.7 

16 

May 

75.6 

18 

June 

75.7 

11 

July 

73.8 

8 

August 

73.8 

9 

September 

72.0 

13 

October 

77.9 

16 

November 

74.2 

9 

December 

72.6 

9 

January 

74.5 

9 

February 

71.5 

10 

March 

70.7 

11 

April 

69.1* 

12* 

May 

69.7* 

7* 

June 

69.5* 

6* 

Solar  flux  in  units  of  10“  JANSKY  (where  one  JANSKY  equals  10'“  W  m'^  Hz  '  Bandwidth) 
*  Provisional 
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ENCLOSURE  1 


TABLE  E-2 


13-MONTH  SMOOTHED  10.7-CM  SOLAR  FLUX 


MONTH 

1989 

1990 

1991 

1992 

1993 

1994 

1995 

JANUARY 

190.2 

200.4 

205.5 

181.8 

125.6 

92.7 

80.6 

FEBRUARY 

194.1 

200.5 

206.3 

174.8 

123.0 

91.2 

80.2 

MARCH 

199.7 

198.7 

205.9 

168.5 

120.6 

90.2 

79.9 

APRIL 

204.4 

195.6 

206.8 

162.9 

118.1 

89.3 

79.2 

MAY 

209.3 

192.4 

207.1 

158.8 

114.8 

88.2 

78.5 

JUNE 

213.1 

189.9 

207.4 

154.2 

111.3 

86.7 

77.7 

JULY 

212.6 

190.4 

207.7 

146.6 

109.6 

84.5 

76.9 

AUGUST 

209.7 

193.9 

206.8 

138.9 

107.6 

82.5 

76.0 

SEPTEMBER 

207.2 

198.3 

203.9 

133.7 

103.9 

81.7 

74.8 

OCTOBER 

206.3 

200.6 

199.7 

130.5 

100.4 

81.4 

73.8 

NOVEMBER 

206.1 

201.2 

195.4 

128.2 

97.5 

81.2 

73.2* 

DECEMBER 

203.3 

202.7 

188.9 

127.3 

94.8 

81.0 

72.7* 

*Provisional 

NOTE:  TABLE  E-2  contains  the  13-month  smoothed  10.7-cm  solar  flux  at  the  mid-point  computed  from 
the  National  Research  Council  of  Canada,  Ottawa  and  Penticton  Series  C  observed  monthly  values. 
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ENCLOSURE  2 


TABLE  E-3  ESTIMATES  OF  13-MONTH  SMOOTHED  SOLAR  ACTIVITY  FOR 

BALANCE  OF  CYCLE  22,  CYCLE  23,  AND  BEGINNING  OF  CYCLE  24 


TIME 

10.7-CM 

SOLAR  FLUX 

(  ^10.7) 

GEOMAGNETIC  INDEX 

(  Ap) 

PERCENTILE 

PERCENTILE 

95.0% 

50% 

5.0% 

95.0% 

50% 

5.0% 

1996.0003 

JAN 

74.1 

72.6 

71.3 

10.3 

10.1 

9.7 

1996.0837 

FEB 

75.1 

72.5 

70.2 

10.8 

10.1 

9.3 

1996.1670 

MAR 

76.5 

72.6 

69.5 

11.3 

10.2 

8.8 

1996.2503 

APR 

78.3 

72.8 

68.5 

11.8 

10.2 

8.4 

1996.3337 

MAY 

80.1 

73.0 

67.6 

12.1 

10.2 

8.0 

1996.4170 

JUN 

82.4 

73.3 

66.8 

12.5 

10.2 

7.8 

1996.5003 

JUL 

85.1 

73.7 

66.2 

12.8 

10.1 

7.7 

1996.5837 

AUG 

87.8 

74.2 

65.2 

13.0 

10.2 

7.7 

1996.6670 

SEP 

91.2 

74.8 

64.3 

13.1 

10.2 

7.9 

1996.7503 

OCT 

96.0 

75.7 

63.4 

13.0 

10.2 

8.1 

1996.8337 

NOV 

100.9 

76.9 

62.5 

13.5 

10.3 

8.4 

1996.9170 

DEC 

105.4 

78.2 

61.7 

14.5 

10.4 

8.8 

1997.0003 

JAN 

109.3 

79.5 

60.9 

15.0 

10.5 

8.8 

1997.0837 

FEB 

112.3 

80.7 

60.5 

15.3 

10.6 

8.7 

1997.1670 

MAR 

115.4 

82.1 

60.4 

15.7 

10.8 

8.7 

1997.2503 

APR 

119.4 

83.5 

60.2 

15.9 

11.0 

8.5 

1997.3337 

MAY 

123.7 

85.1 

60.3 

15.9 

11.1 

8.5 

1997.4170 

JUN 

128.4 

86.7 

60.6 

16.2 

11.1 

8.5 

1997.5003 

JUL 

134.1 

88.6 

60.8 

16.4 

11.2 

8.7 

1997.5837 

AUG 

140.1 

90.8 

61.3 

16.4 

11.4 

8.9 

1997.6670 

SEP 

144.8 

93.0 

62.3 

16.5 

11.7 

9.1 

1997.7503 

OCT 

148.1 

95.1 

63.3 

16.8 

12.0 

9.5 

1997.8337 

NOV 

150.1 

97.2 

64.9 

16.9 

12.4 

10.0 

1997.9170 

DEC 

152.1 

99.2 

66.8 

16.4 

12.7 

10.6 

1998.0003 

JAN 

154.9 

101.4 

68.3 

15.8 

12.8 

10.9 

1998.0837 

FEB 

158.2 

103.7 

69.6 

15.6 

12.9 

10.4 

1998.1670 

MAR 

161.5 

105.9 

71.4 

15.7 

13.1 

10.4 

1998.2503 

APR 

165.1 

108.2 

73.1 

15.7 

13.3 

10.4 

1998.3337 

MAY 

170.3 

110.6 

73.7 

16.0 

13.4 

10.4 

1998.4170 

JUN 

175.6 

113.0 

74.3 

16.3 

13.5 

10.4 

1998.5003 

JUL 

178.8 

115.5 

76.9 

16.5 

13.5 

10.7 

1998.5837 

AUG 

180.4 

118.0 

78.5 

16.4 

13.5 

11.0 

1998.6670 

SEP 

182.0 

120.4 

80.0 

16.2 

13.3 

11.2 

1998.7503 

OCT 

183.6 

122.9 

81.6 

16.3 

13.3 

11.2 

1998.8337 

NOV 

185.1 

125.2 

83.1 

16.4 

13.3 

11.3 

1998.9170 

DEC 

186.6 

127.6 

84.6 

16.5 

13.2 

11.4 

1999.0003 

JAN 

188.1 

129.9 

86.0 

16.6 

13.2 

11.5 

1999.0837 

FEB 

189.6 

132.2 

87.5 

16.7 

13.2 

11.5 

1999.1670 

MAR 

191.0 

134.3 

88.9 

16.8 

13.2 

11.6 

1999.2503 

APR 

192.3 

136.4 

90.2 

16.9 

13.2 

11.7 

1999.3337 

MAY 

193.6 

138.4 

91.5 

17.0 

13.1 

11.8 

1999.4170 

JUN 

194.9 

140.3 

92.7 

17.1 

13.1 

11.8 

1999.5003 

JUL 

196.0 

142.1 

93.8 

17.2 

13.1 

11.9 

1999.5837 

AUG 

197.1 

143.8 

94.9 

17.3 

13.1 

11.9 

1999.6670 

SEP 

198.1 

145.3 

95.8 

17.3 

13.1 

12.0 

1999.7503 

OCT 

199.0 

146.7 

96.7 

17.4 

13.1 

12.0 

1999.8337 

NOV 

199.8 

147.9 

97.5 

17.5 

13.0 

12.1 

1999.9170 

DEC 

200.5 

149.0 

98.2 

17.5 

13.0 

12.1 

2000.0003 

JAN 

201.1 

149.9 

98.8 

17.5 

13.0 

12.1 

2000.0837 

FEB 

201.5 

150.6 

99.2 

17.6 

13.0 

12.2 
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TABLE  E-3  ESTIMATES  OF  13-MONTH  SMOOTHED  SOLAR  ACTIVITY  FOR 

BALANCE  OF  CYCLE  22,  CYCLE  23,  AND  BEGINNING  OF  CYCLE  24 


TIME 

10.7-CM 

SOLAR  FLUX 

(  ^10.7) 

GEOMAGNETIC  INDEX 

(  Ap) 

PERCENTILE 

PERCENTILE 

95.0% 

50% 

5.0% 

95.0% 

50% 

5.0% 

2000.1670 

MAR 

201.9 

151.2 

99.6 

17.6 

13.0 

12.2 

2000.2503 

APR 

202.1 

151.5 

99.8 

17.6 

13.0 

12.2 

2000.3337 

MAY 

202.2 

151.6 

99.8 

17.6 

13.0 

12.2 

2000.4170 

JUN 

239.0 

158.1 

98.5 

20.0 

15.2 

11.9 

2000.5003 

JUL 

234.3 

156.1 

97.8 

19.7 

15.2 

12.2 

2000.5837 

AUG 

230.8 

154.2 

96.7 

19.3 

15.3 

12.2 

2000.6670 

SEP 

230.3 

152.7 

95.3 

19.0 

15.5 

12.3 

2000.7503 

OCT 

230.0 

151.1 

93.7 

18.8 

15.6 

12.5 

2000.8337 

NOV 

227.8 

149  .'7 

92.5 

18.6 

15.6 

12.1 

2000.9170 

DEC 

225.6 

148.1 

91.6 

18.6 

15.6 

11.5 

2001.0003 

JAN 

224.8 

146.6 

90.8 

18.5 

15.6 

11.4 

2001.0837 

FEB 

223.7 

144.9 

90.0 

18.6 

15.6 

11.4 

2001.1670 

MAR 

222.6 

143.0 

89.0 

18.8 

15.6 

11.4 

2001.2503 

APR 

219.3 

140.8 

88.3 

19.3 

15.7 

11.4 

2001.3337 

MAY 

214.8 

138.7 

87.4 

19.9 

15.8 

11.5 

2001.4170 

JUN 

211.2 

137.0 

86.0 

20.8 

16.0 

11.5 

2001.5003 

JUL 

205.9 

135.8 

85.3 

21.3 

16.2 

11.4 

2001.5837 

AUG 

200.9 

134.7 

84.7 

21.1 

16.3 

11.3 

2001.6670 

SEP 

196.3 

133.7 

83.7 

20.8 

16.2 

11.1 

2001.7503 

OCT 

191.4 

132.8 

82.4 

21.3 

16.3 

11.1 

2001.8337 

NOV 

187.0 

131.4 

81.0 

22.4 

16.5 

11.2 

2001.9170 

DEC 

182.3 

129.7 

80.1 

22.6 

16.7 

11.7 

2002.0003 

JAN 

177.9 

127.8 

79.5 

22.2 

16.9 

12.4 

2002.0837 

FEB 

176.3 

126.0 

78.8 

22.0 

16,8 

12.5 

2002.1670 

MAR 

176.1 

124.4 

78.0 

22.1 

16.8 

12.6 

2002.2503 

APR 

176.7 

123.0 

77.1 

22.7 

17.0 

12.7 

2002.3337 

MAY 

177.8 

121.9 

76.1 

23.3 

17.3 

12.8 

2002.4170 

JUN 

177.3 

120.5 

75.2 

23.3 

17.3 

13.2 

2002.5003 

JUL 

175.1 

118.6 

74.3 

23.3 

17.2 

13.9 

2002.5837 

AUG 

171.4 

116.7 

73.3 

22.8 

17.2 

14.1 

2002.6670 

SEP 

166.1 

114.8 

72.3 

21.2 

17.0 

13.7 

2002.7503 

OCT 

161.6 

113.2 

71.8 

21.3 

16.7 

13.2 

2002.8337 

NOV 

160.0 

111.8 

71.5 

21.2 

16.7 

13.1 

2002.9170 

DEC 

159.0 

110.3 

71.1 

21.0 

16.6 

13,0 

2003.0003 

JAN 

157.4 

108.9 

70.9 

20.5 

16.4 

12.8 

2003.0837 

FEB 

154.7 

107.4 

71.0 

20.0 

16.2 

12.6 

2003.1670 

MAR 

150.4 

105.6 

70.9 

19.4 

15.8 

12.3 

2003.2503 

APR 

144.8 

103.9 

70.7 

18.7 

15.6 

12.3 

2003.3337 

MAY 

139.0 

102.5 

70.5 

17.9 

15.3 

12.2 

2003.4170 

JUN 

133.3 

101.3 

70.4 

17 . 6 

15.1 

12.2 

2003.5003 

JUL 

128.5 

100.1 

70.3 

17 . 9 

14.9 

12.1 

2003.5837 

AUG 

124.6 

98.8 

70.6 

18.3 

14.7 

11.9 

2003.6670 

SEP 

121.5 

97.6 

71.0 

18.5 

14.6 

12.0 

2003.7503 

OCT 

119.3 

96.6 

71.1 

19.1 

14.6 

11.5 

2003.8337 

NOV 

117.9 

95.5 

71.3 

19.6 

14.6 

11.1 

2003.9170 

DEC 

116.4 

94.6 

71.1 

20.0 

14.8 

11.4 

2004.0003 

JAN 

117.6 

93.5 

70,6 

20.4 

15.0 

11.4 

2004.0837 

FEB 

117.2 

92.4 

70.3 

20.6 

15.1 

11.4 

2004.1670 

MAR 

116.3 

91.6 

69.9 

20.9 

15.4 

11.6 

2004.2503 

APR 

116.8 

90.9 

69.5 

21.6 

15.6 

11.7 
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TABLE  E-3  ESTIMATES  OF  13 -MONTH  SMOOTHED  SOLAR  ACTIVITY  FOR 

BALANCE  OF  CYCLE  22,  CYCLE  23,  AND  BEGINNING  OF  CYCLE  24 


TIME 

10.7-CM 

SOLAR  FLUX 

(  ^10.7) 

GEOMAGNETIC  INDEX 

(  Ap) 

PERCENTILE 

PERCENTILE 

95.0% 

50% 

5.0% 

95.0% 

50% 

5.0% 

2004.3337 

MAY 

117.3 

89.9 

69.2 

21.8 

15.8 

11.7 

2004.4170 

JUN 

117.6 

89.0 

68.9 

21.8 

16.0 

11.6 

2004.5003 

JUL 

117.4 

88.0 

68.6 

22.0 

16.3 

11.6 

2004.5837 

AUG 

116.7 

86.9 

68.2 

22.1 

16.6 

11.9 

2004.6670 

SEP 

115.5 

85.9 

67.9 

22.4 

16.9 

11.9 

2004.7503 

OCT 

113.9 

85.1 

67.7 

23.0 

17.0 

11.9 

2004.8337 

NOV 

110.7 

84.1 

67.6 

23.6 

17.0 

12.4 

2004.9170 

DEC 

105.3 

83.0 

67.5 

24.1 

16.9 

12.8 

2005.0003 

JAN 

99.5 

82.0 

67.5 

24.3 

16.8 

12.8 

2005.0837 

FEB 

98.3 

81.0 

67.3 

24.1 

16.5 

12.7 

2005.1670 

MAR 

98.8 

80.2 

67.2 

23.4 

16.0 

12.8 

2005.2503 

APR 

99.9 

79.7 

67.2 

22.5 

15.6 

12.8 

2005.3337 

MAY 

100.4 

79.2 

67.2 

21.7 

15.3 

12.8 

2005.4170 

JUN 

98.8 

78.6 

67.2 

21.4 

14.9 

12.4 

2005.5003 

JUL 

95.8 

78.1 

67.1 

20.9 

14.5 

11.8 

2005.5837 

AUG 

93.2 

77.6 

67.1 

20.5 

14.1 

11.2 

2005.6670 

SEP 

91.7 

77.1 

67.0 

19.9 

13.6 

10.5 

2005.7503 

OCT 

91.6 

76.5 

67.0 

19.0 

13.1 

9.9 

2005.8337 

NOV 

91.1 

76.0 

67.0 

17.9 

12.7 

9.4 

2005.9170 

DEC 

90.7 

75.6 

67.0 

16.8 

12.3 

8.9 

2006.0003 

JAN 

90.0 

75.1 

67.0 

16.2 

12.0 

8.5 

2006.0837 

FEB 

89.1 

74.7 

67.1 

16.1 

11.8 

8.3 

2006.1670 

MAR 

88.4 

74.4 

67.3 

16.4 

11.8 

8.0 

2006.2503 

APR 

87.5 

74.1 

67.3 

16.6 

11.6 

7.6 

2006.3337 

MAY 

86.1 

73.7 

67.4 

16.3 

11.4 

7.3 

2006.4170 

JUN 

84.6 

73.3 

67.5 

16.1 

11.3 

7.2 

2006.5003 

JUL 

82.5 

72.9 

67.6 

15.9 

11.3 

7.1 

2006.5837 

AUG 

80.0 

72.5 

67.7 

15.4 

11.2 

7.2 

2006.6670 

SEP 

77.7 

72.1 

68.0 

15.2 

11.2 

7.4 

2006.7503 

OCT 

77.4 

71.9 

68.0 

15.3 

11.2 

7.6 

2006.8337 

NOV 

77.0 

71.7 

68.0 

15.8 

11.2 

7.8 

2006.9170 

DEC 

77.5 

71.6 

68.0 

16.0 

11.3 

7.9 

2007.0003 

JAN 

78.4 

71.5 

67.9 

16.2 

11.3 

7.9 

2007.0837 

FEB 

80.0 

71.6 

67.9 

16.4 

11.3 

8.1 

2007.1670 

MAR 

82.1 

71.8 

67.7 

16.5 

11.3 

8.1 

2007.2503 

APR 

84.0 

72.0 

67.6 

16.8 

11.2 

8.1 

2007.3337 

MAY 

86.4 

72.3 

67.7 

17.0 

11.2 

8.1 

2007.4170 

JUN 

89.3 

72.7 

67.6 

17.1 

11.1 

8.2 

2007.5003 

JUL 

92.3 

73.0 

67.6 

17.1 

11.1 

8.2 

2007.5837 

AUG 

95.8 

73.5 

67.6 

16.8 

11.0 

8.1 

2007.6670 

SEP 

101.2 

74.2 

67.9 

16.5 

11.0 

8.0 

2007.7503 

OCT 

106.7 

75.2 

68.1 

15.7 

10.9 

8.0 

2007.8337 

NOV 

112.0 

76.3 

68.5 

15.0 

10.9 

8.0 

2007.9170 

DEC 

116.6 

77.3 

68.5 

15.5 

10.9 

8.2 

2008.0003 

JAN 

120.1 

78.5 

68.8 

15.5 

10.9 

8.2 

2008.0837 

FEB 

123.6 

79.7 

69.1 

15.7 

11.0 

8.3 

2008.1670 

MAR 

127.8 

81.0 

69.3 

15.9 

11.1 

8.2 

2008.2503 

APR 

132.5 

82.5 

69.5 

16.0 

11.2 

8.3 

2008.3337 

MAY 

137.4 

84.1 

70.2 

16.2 

11.2 

8.4 

2008.4170 

JUN 

143.4 

85.9 

71.0 

16.4 

11.3 

8.6 
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TABLE  E-3  ESTIMATES  OF  13 -MONTH  SMOOTHED  SOLAR  ACTIVITY  FOR 

BALANCE  OF  CYCLE  22,  CYCLE  23,  AND  BEGINNING  OF  CYCLE  24 


TIME 

10.7-CM 

SOLAR  FLUX 

(  ^10.7  ) 

GEOMAGNETIC  INDEX 

(  Ap) 

PERCENTILE 

PERCENTILE 

95.0% 

50% 

5.0% 

95.0% 

50% 

5.0% 

2008.5003 

JUL 

149.7 

87.9 

71.8 

16.4 

11.5 

8.9 

2008.5837 

AUG 

154.6 

90.1 

72.5 

16.5 

11.7 

9.1 

2008.6670 

SEP 

158.2 

92.2 

73.7 

16.8 

12.0 

9.6 

2008.7503 

OCT 

160.3 

94.2 

74.2 

16.8 

12.3 

10.1 

2008.8337 

NOV 

162.1 

96.2 

74.4 

16.3 

12.5 

10.4 

2008.9170 

DEC 

164.4 

98.6 

74.7 

15.8 

12.7 

10.4 

2009.0003 

JAN 

167.7 

100.9 

74.9 

15.6 

12.9 

10.3 

2009.0837 

FEB 

171.4 

103.0 

75.0 

15.7 

13.1 

10.4 

2009.1670 

MAR 

174.9 

105.3 

74.9 

15.8 

13.3 

10.5 

2009.2503 

APR 

179.5 

107.5 

74.8 

16.2 

13.4 

10.6 

2009.3337 

MAY 

185.0 

109.9 

74.7 

16.5 

13.5 

10.7 

2009.4170 

JUN 

188.6 

112.2 

75.0 

16.9 

13.6 

11.2 

2009.5003 

JUL 

190.3 

114.7 

75.3 

17.4 

13.7 

12.1 

2009.5837 

AUG 

190.7 

117.3 

75.6 

17.7 

13.7 

12.0 

2009.6670 

SEP 

191.0 

119.6 

76.6 

17.7 

13.6 

11.4 

2009.7503 

OCT 

193.5 

121.6 

77.4 

17.9 

13.6 

10.9 

2009.8337 

NOV 

196.6 

123.2 

77.7 

18.2 

13.7 

10.6 

2009.9170 

DEC 

199.1 

124.9 

78.7 

17.9 

13.7 

10.5 

2010.0003 

JAN 

202.7 

126.8 

80.0 

17.3 

13.7 

10.9 

2010.0837 

FEB 

208.4 

128.8 

81.5 

17.6 

13.8 

11.0 

2010.1670 

MAR 

212.9 

131.0 

83.5 

18.4 

13.8 

10.5 

2010.2503 

APR 

215.1 

133.1 

84.8 

18.6 

13.9 

10.5 

2010.3337 

MAY 

218.6 

135.5 

85.8 

19.1 

14.2 

10.7 

2010.4170 

JUN 

223.7 

138.3 

87.7 

19.9 

14.4 

10.9 

2010.5003 

JUL 

226.5 

141.0 

89.5 

19.6 

14.3 

11.0 

2010.5837 

AUG 

228.2 

143.2 

90.8 

19.7 

14.2 

10.7 

2010.6670 

SEP 

230.2 

145.6 

93.2 

19.9 

14.3 

10.8 

2010.7503 

OCT 

232.2 

148.2 

95.7 

20.3 

14.3 

10.5 

2010.8337 

NOV 

235.2 

150.7 

97.5 

20.5 

14.4 

10.4 

2010.9170 

DEC 

238.5 

152.9 

98.7 

20.5 

14.6 

11.0 

2011.0003 

JAN 

240.7 

154.8 

98.7 

20.8 

14.8 

11.4 

2011.0837 

FEB 

240.1 

156.6 

98.1 

21.1 

15.0 

11.4 

2011.1670 

MAR 

239.8 

158.8 

97.7 

21.6 

15.2 

11.5 

2011.2503 

APR 

241.6 

161.0 

99.1 

22 . 1 

15.5 

11,7 

2011.3337 

MAY 

241.6 

161.0 

99.1 

22.1 

15.5 

11.7 

NASA/MSFC/EL23 


E-8 


ENCLOSURE  3 
(4  of  4) 


13-Month  Smoothed  Solar  Flux  10.7-cm  Solar  Flux  Series 
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Figure  E-1.  Plot  of  Recent  Monthly  Mean  and  13-Month  Smoothed  Solar  Flux 


Date 


Figure  E-2.  Estimate  of  13-Month  Smoothed  Solar  Flux  for  Balance  of  Cycle  22,  23,  and 
Beginning  of  Cycle  24 
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ENCLOSURE  4 


APPENDIX  F 


MSFC  Lagrangian  Linear  Regression  Technique  Historical  Plots 


MSFC  Lagrangian  Linear  Regression  Technique  (MLLRT)  Historical  Plots  were  developed  to 
provide  an  assessment  of  the  MLLRT  over  the  last  three  cycles.  The  plots  are  made  starting  on  the 
month  and  year  specified  on  each  plot.  These  plots  are  made  for  one  month  for  each  year  during 
cycles  20, 21,  and  22.  The  plots  show  the  estimates  as  in  the  MSFC  MonAly  Solar  Activity 
memorandum  with  the  observed  13-Month  Smoothed  Fjo  ,  added.  Each  estimate  plot  is  for  five  years. 
This  appendix  contains  also  the  completed  cycle  plots  for  Ae  minimum  to  minimum  cycles  and  the 
maximum  to  maximum  cycles.  Concerning  the  maximum  to  maximum  cycle  plots,  the  cycles  are 
identified  by  cycle  number  at  maximum  initialization  date.  For  example,  maximum  to  maximum  cycle 
20  starts  on  July  1970  and  finishes  at  the  maximum  estimate  of  cycle  21 .  Concerning  the  minimum  to 
minimum  cycle  plots,  the  cycles  are  identified  by  cycle  number  at  minimum  initialization  date,  e.  g., 
minimum  to  minimum  cycle  20  starts  on  October  1964  and  finishes  at  the  minimum  estimate  of  cycle 
21. 


F-1 
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Figure  F-1.  Minimum  to  Minimum  Cycle  20  MLLRT  Model  Results  Evaluation. 
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Figure  F-2.  Maximum  to  Maximum  Cycle  20  MLLRT  Model  Results  Evaluation. 


Figure  F-3.  Minimum  to  Minimum  Cycle  21  MLLRT  Model  Results  Evaluation. 
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Figure  F-4.  Maximum  to  Maximum  Cycle  21  MLLRT  Model  Results  Evaluation. 
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Figure  F-6.  Maximum  to  Maximum  Cycle  22  MLLRT  Model  Results  Evaluation. 
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